Abstract

Garbage  collection  (GC)  is  an  important  part  of  many  language  implementations.  One  of  the  most  important 
garbage  collection  techniques  is  copying  GC.  This  paper  consists  of  an  informal  but  abstract  description  of 
copying  collection,  a  formal  specification  of  copying  collection  written  in  the  Larch  Shared  Language  and 
the  Larch/C  Interface  Language,  a  simple  implementation  of  a  copying  collector  written  in  C,  an  informal 
proof  that  the  implementation  satisfies  the  specification,  and  a  discussion  of  how  the  specification  applies  to 
other  types  of  copying  GC  such  as  generational  copying  collectors.  Limited  familiarity  with  copying  GC  or 
Larch  is  needed  to  read  the  specification. 
1. Introduction


Automatic storage reclamation, or garbage collection, is an important service provided by many language
implementations [11]. There are two major techniques used for garbage collection: reference counting and
tracing. Reference counting requires explicitly accounting for the number of references to each data item.
Tracing collectors trace the pointer graph to find the reachable data. The two major variants of the tracing
approach  are  Mark  and  Sweep  collectors  (MSGC)  [7],  and  Copying  collectors  (CGC)  [3].  Mark  and  Sweep 
collectors  mark  the  data  reachable  from  the  roots  as  they  trace  out  the  pointer  graph.  They  then  “sweep 
up”  the  unmarked  data  into  a  free  list  for  reallocation.  Copying  collection  copies  the  reachable  data  of  the 
graph  to  an  unused  portion  of  memory,  leaving  the  garbage  behind. 

This  paper  presents  a  Larch  specification  and  a  simple  implementation  of  copying  collection  as  well  as 
an  informal  proof  that  the  implementation  satisfies  the  specification.  The  specification  itself  is  composed  of 
two  parts,  one  in  the  Larch  Shared  Language  (LSL)  which  is  used  to  specify  general  properties  of  CGC,  and 
one  in  the  Larch/C  Language  (LCL)  which  uses  the  LSL  traits  to  specify  all  of  the  C  routines  used  in  the 
implementation.  The  specification  should  be  readable  even  by  those  not  familiar  with  Larch.  For  a  general 
introduction to LSL see [6] [5] and to LCL [4]. Both the LSL and LCL specifications have been syntax and
type checked, although no effort has been made at formal verification.

I do not further consider Mark and Sweep collection in detail. However, since it is also based on tracing
the pointer graph, those portions of the specification that deal with tracing apply to it as well. Reference
counting is not addressed at all.

The paper begins with a very abstract description of how a tracing collector works, followed by a
description of the standard implementation techniques used for copying collection. Next, I present the LSL
and LCL specifications, followed by the implementation along with informal proofs that the implementation
satisfies the specification. Finally, I discuss the applicability of the specification to several important variants
of CGC, and some related work on formalising GC.


2.  Tracing  and  Copying  Collection 


The  pointers  contained  in  data  form  a  directed  graph,  where  the  data  are  the  nodes  and  the  pointers  are  the 
edges.  Any  portion  of  this  graph  that  a  program  cannot  reach  by  dereferencing  pointers  is  inaccessible  to  the 
program. Such inaccessible data is called garbage and can be reallocated, while any data that is accessible is
called live and must be preserved. Tracing collectors find the live data by computing the transitive closure
of  the  points-to  relation  starting  from  the  set  of  known  live  data,  called  roots.  The  differences  among  tracing 
collectors  lie  in  what  algorithm  is  used  to  compute  the  transitive  closure,  and  what  is  done  to  the  live  data 
when  they  are  found  by  the  algorithm.  Algorithms  for  computing  transitive  closures  are  graph  searching 
algorithms,  and  not  surprisingly  MSGC  uses  a  depth-first  search,  and  CGC  a  breadth-first  search.  (But  see 
Section  5  for  some  exceptions  to  this  rule.)  Both  may  use  clever  representation  techniques  to  avoid  using 
extra  storage  beyond  that  needed  for  the  data  while  computing  the  transitive  closure. 

The following is a general graph searching algorithm. Nodes are divided into two disjoint sets: the seen
nodes, which are known to be in the transitive closure, and the unseen nodes, which may or may not be.
The seen nodes are further divided into two disjoint sets: the visited nodes, which have had the nodes they
refer to added to the seen set, and the unvisited nodes, which have not. The algorithm starts by placing all
the roots in the unvisited set; all other nodes are in the unseen set. It proceeds by selecting some member
of the unvisited set, adding the nodes that it refers to that are unseen to the unvisited set, and then adding
the node to the visited set. When the unvisited set is empty, the algorithm terminates, and all of the live
nodes are in the visited set. Depth-first search of the graph results from managing the unvisited set as a
stack and breadth-first search results from managing it as a queue.
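The general search just described can be sketched in C. This is only an illustration under stated assumptions, not code from the paper: the fixed-size graph, the names trace, state, MAX_NODES and MAX_EDGES, and the use of small integers as node names are all hypothetical. The unvisited set is kept in an array used as a queue, so the traversal is breadth-first; popping from the tail instead of the head would give a stack, and therefore depth-first order.

```c
#include <assert.h>

#define MAX_NODES 8   /* hypothetical graph size */
#define MAX_EDGES 2   /* hypothetical out-degree limit */
#define NONE (-1)     /* marks an absent edge */

enum state { UNSEEN, UNVISITED, VISITED };

struct graph {
    int edges[MAX_NODES][MAX_EDGES];
};

/* Marks every node reachable from the roots as VISITED and returns how
 * many nodes were visited. Unreachable nodes stay UNSEEN. */
static int trace(const struct graph *g, const int *roots, int nroots,
                 enum state st[MAX_NODES])
{
    int work[MAX_NODES];      /* the unvisited set */
    int head = 0, tail = 0;   /* FIFO order gives breadth-first search */
    int visited = 0;

    for (int i = 0; i < MAX_NODES; i++)
        st[i] = UNSEEN;

    /* Start with the roots in the unvisited set. */
    for (int i = 0; i < nroots; i++) {
        if (st[roots[i]] == UNSEEN) {
            st[roots[i]] = UNVISITED;
            work[tail++] = roots[i];
        }
    }

    while (head != tail) {
        int n = work[head++];             /* select an unvisited node */
        for (int e = 0; e < MAX_EDGES; e++) {
            int t = g->edges[n][e];
            if (t != NONE && st[t] == UNSEEN) {
                assert(tail < MAX_NODES); /* each node is queued at most once */
                st[t] = UNVISITED;        /* add unseen referent to the seen set */
                work[tail++] = t;
            }
        }
        st[n] = VISITED;
        visited++;
    }
    return visited;
}
```

Because a node enters the queue only when it changes from UNSEEN to UNVISITED, each node is queued at most once, which is why a queue of MAX_NODES entries suffices in this sketch.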

In addition to performing some variant of the algorithm above, tracing collectors perform some additional
actions when a node is added to the seen set. For MSGC this consists of marking the node so that the
reachable nodes can be distinguished from the unreachable ones during the sweep phase. For CGC this
consists of copying the node to a new location in memory. Since other nodes may still refer to the original
node, when a node is copied the original node must be modified so that the fact that it has been copied can
be detected, and where it has been copied to can be found. This is usually done by marking the node as
“forwarded” using a tag and writing a forwarding pointer into the data indicating where it was copied to.

The usefulness of CGC comes in part from the use of a clever encoding of the unseen, unvisited and
visited sets so that no more memory is used by the algorithm than is needed to copy just the live data. The
unseen and seen sets are encoded by placing them in different portions of memory. From-space holds the
unseen set and is where data is copied from; to-space holds the seen set and is where the data is copied to.
Typically CGC visits the data in the graph in a breadth-first manner, and thus the unvisited set must form
a queue. To effect a queue, CGC uses two pointers into to-space, the unscanned pointer and the scanned
pointer. The unscanned pointer points to the first location of to-space that is unused and it forms the tail of
the queue. Data is added to the seen set by copying it to the location referred to by the unscanned pointer.
The scanned pointer points to the location of the first unvisited node, and forms the head of the queue.
Because of the use of unscanned and scanned pointers, CGC terminology generally uses the term unscanned
for unvisited, and scanned for visited.

The standard CGC algorithm is known as the Cheney scan [1]. It utilises three basic operations: copying,
forwarding, and scanning. Copying copies a node to the location referred to by the unscanned pointer and
sets the unscanned pointer to refer to the first location after the newly copied data. It also modifies the
original data to record the fact that the data has been copied, as well as the location it was copied to. This
is exactly the act of adding the node to the seen set. Forwarding modifies a pointer to from-space data so
that it refers to the to-space copy of the data. If the node has not yet been copied, it copies it. Scanning
a node forwards each pointer in the node and advances the scanned pointer so that it refers to the next
node in to-space. Since forwarding guarantees that a node has been copied, scanning corresponds directly
to adding the node to the visited set. In addition, scanning guarantees that no pointers into from-space are
found in scanned nodes.

Given the operations and data structures above, the actual garbage collection algorithm is very simple.
When the user program (known as the mutator) runs out of storage, the garbage collector is called. The
roots are defined in an implementation-dependent manner, and the unscanned and scanned pointers are
directed at the beginning of to-space. Next, each root is forwarded. This causes all directly reachable nodes
to be copied into the unscanned (seen and unvisited) set and updates the roots so that they point to the new
copies. It does not change the scanned pointer. Now the node pointed to by the scanned pointer is scanned
and the scanned pointer is advanced past the newly scanned node. This is repeated until the scanned pointer
equals the unscanned pointer, which indicates the queue is empty. When this happens the roles of the two
spaces are exchanged (“flipped”) and the mutator can resume. This process examines each live node twice,
once to copy it and once to scan it, and thus the cost of the algorithm is proportional to the number of live
nodes. The live nodes are copied into a contiguous region of memory, which serves to compact memory.
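The copy, forward and scan operations and the collection loop can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the implementation presented later in the paper: the names cell, heap, copy, forward, scan, collect and SPACE_SIZE are hypothetical, every node is a two-pointer cell with an explicit forwarding field, and every pointer handed to forward is assumed to be either NULL or a pointer into from-space. The sketch stops before the flip.

```c
#include <assert.h>
#include <stddef.h>

#define SPACE_SIZE 16   /* hypothetical size of each semispace, in cells */

struct cell {
    struct cell *car, *cdr;
    struct cell *forward;    /* NULL until the cell has been copied */
};

struct heap {
    struct cell from[SPACE_SIZE];
    struct cell to[SPACE_SIZE];
    struct cell *scanned;    /* head of the queue: first unscanned cell */
    struct cell *unscanned;  /* tail of the queue: first free to-space cell */
};

/* Copy: place a copy of *c at the unscanned pointer, advance that pointer,
 * and record the new location in the original cell. */
static struct cell *copy(struct heap *h, struct cell *c)
{
    assert(h->unscanned < h->to + SPACE_SIZE);
    struct cell *n = h->unscanned++;
    n->car = c->car;
    n->cdr = c->cdr;
    n->forward = NULL;
    c->forward = n;
    return n;
}

/* Forward: make *p refer to the to-space copy, copying the cell first if
 * it has not been copied yet. Assumes *p is NULL or points to from-space. */
static void forward(struct heap *h, struct cell **p)
{
    struct cell *c = *p;
    if (c == NULL)
        return;
    *p = c->forward != NULL ? c->forward : copy(h, c);
}

/* Scan: forward each pointer in the cell at the scanned pointer, then
 * advance the scanned pointer past it. */
static void scan(struct heap *h)
{
    forward(h, &h->scanned->car);
    forward(h, &h->scanned->cdr);
    h->scanned++;
}

/* Collect without flipping; returns the number of live cells copied. */
static size_t collect(struct heap *h, struct cell **roots, size_t nroots)
{
    h->scanned = h->unscanned = h->to;
    for (size_t i = 0; i < nroots; i++)
        forward(h, &roots[i]);
    while (h->scanned != h->unscanned)   /* queue not yet empty */
        scan(h);
    return (size_t)(h->unscanned - h->to);
}
```

No separate queue is allocated: the cells between the scanned and unscanned pointers are the queue, which is the encoding described in the text.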


3.  The  Specification 

All  of  the  key  concepts  and  terminology  needed  to  understand  the  specification  have  been  introduced.  The 
specification  itself  is  made  up  of  two  kinds  of  components,  LSL  traits  and  LCL  interfaces.  The  LSL  traits 
define  sorts  and  functions  at  a  high  level  of  abstraction  and  form  the  vocabulary  used  in  the  interfaces. 
The LCL interfaces specify pre-conditions that must be satisfied before the routine may be used, and
post-conditions that the routine must guarantee upon termination.

I  first  present  the  traits  containing  the  key  sorts  and  some  important  general  functions.  Then  I  present 
the  LCL  interfaces  in  a  top-down  fashion  along  with  the  supporting  LSL  traits.  The  Appendix  contains 
several  of  the  less  important  traits  which  I  do  not  discuss  here. 
Many of the LSL functions take the form op(arg, arg'), which specifies a relation between a pre-state
and a post-state. In LCL interfaces pre-states are notated with a ^ and post-states with a '.


3.1. Address trait

Address : trait
  includes Set(A, SA)    % Sets of Addresses

Figure 1: The Address Trait

Addresses (A) are used to “index” memory. They can only be compared for equality, since no other operations
are defined on them.


3.2.  Node  trait 


Node : trait
  includes Address
  includes Set(Val, SV)    % Sets of Values
  includes Set(N, SN)      % Sets of Nodes

  N tuple of id : UID, addrs : SA, vals : SV

Figure 2: The Node Trait

Nodes (N) are the basic data items of the specification. They consist of a unique identifier, a set of addresses
that are the addresses of the other nodes “pointed to” by the node, and a set of values representing the
non-pointer data in a node.


3.3.  Memory  trait 

A memory (M) (figure 3) consists of four sets of addresses and two maps. Roots is the set of root addresses.
Uncopied, unscanned, and scanned are the sets of addresses that are uncopied, unscanned, and scanned.
Collectively, unscanned and scanned are the addresses that have been copied. The mem map maps addresses to
nodes, while the forwarded map maps the original address of a copied node to its new address.

isValidMemory captures the notion that a memory is well-formed. It is an invariant of all the LCL
interfaces. Since it is the first function we have seen, and an important one as well, let's examine it in
detail. The line

  isOneToOne(m.mem) ∧ isOneToOne(m.forwarded)

says that only one node can be located at any given address in memory, and that only one node can be
forwarded to any given address. The line

  isValidAddrSet(m.roots, m)

says that all the roots are addresses located in memory. The lines

  m.uncopied ∩ m.unscanned = {}
  m.uncopied ∩ m.scanned = {}
  m.unscanned ∩ m.scanned = {}

say that the uncopied, unscanned, and scanned sets are all disjoint. The line

  m.uncopied ∩ domain(m.forwarded) = {}

says that no uncopied address has been forwarded. The line

  m.unscanned ∪ m.scanned = range(m.forwarded)

says that the addresses which have been copied are exactly those which are mapped to by the forwarded
map.
MemoryMain : trait
  includes Node
  includes FiniteMappingAux(ANMap, A, N, SA for SDomain)
  includes FiniteMappingAux(AAMap, A, A, SA for SRange, SA for SDomain)
  includes TestSet1Arg(isValidAddr, isValidAddrSet, A, SA, M)

  M tuple of roots : SA,
             uncopied : SA,
             unscanned : SA,
             scanned : SA,
             mem : ANMap,
             forwarded : AAMap
  introduces
    isValidMemory : M → Bool
    isValidAddr : A, M → Bool
    effectiveAddr : A, M → A
  asserts
    ∀ m : M, a : A, n : N
      isValidMemory(m) ==
        isOneToOne(m.mem) ∧ isOneToOne(m.forwarded)
        ∧ isValidAddrSet(m.roots, m)
        ∧ m.uncopied ∩ m.unscanned = {}
        ∧ m.uncopied ∩ m.scanned = {}
        ∧ m.unscanned ∩ m.scanned = {}
        ∧ m.uncopied ∩ domain(m.forwarded) = {}
        ∧ m.unscanned ∪ m.scanned = range(m.forwarded)
        ∧ m.uncopied ∪ m.unscanned ∪ m.scanned = domain(m.mem)
      isValidAddr(a, m) == if defined(m.forwarded, a)
                           then defined(m.mem, m.forwarded[a])
                           else defined(m.mem, a)
      effectiveAddr(a, m) == if defined(m.forwarded, a)
                             then m.forwarded[a]
                             else a
  implies
    converts isValidMemory, isValidAddr, isValidAddrSet, effectiveAddr

Figure 3: The MemoryMain Trait

Finally, the line

  m.uncopied ∪ m.unscanned ∪ m.scanned = domain(m.mem)

says that all nodes are referred to by an address in the uncopied, unscanned, or scanned sets.

effectiveAddr translates unforwarded addresses to forwarded ones, if the node has been copied. The
MemoryAuxiliary trait, found in the appendix, defines many simple functions involving memory, mostly serving
to improve the specification's readability.
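To make the clauses of isValidMemory concrete, here is a small executable model in C. It is a sketch under stated assumptions rather than part of the specification: addresses are the integers 0 to NADDR-1, sets of addresses are bit masks, the mem map stores only a node identifier per address, and all C names (memory, domain_of, range_of, is_valid_memory, effective_addr, and so on) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define NADDR 8      /* hypothetical, tiny address space */
#define UNDEF (-1)   /* marks an address at which a map is undefined */

typedef unsigned set;   /* bit a is set <=> address a is a member */

struct memory {
    set roots, uncopied, unscanned, scanned;
    int mem[NADDR];         /* address -> node id, or UNDEF */
    int forwarded[NADDR];   /* old address -> new address, or UNDEF */
};

static set domain_of(const int map[NADDR])
{
    set s = 0;
    for (int a = 0; a < NADDR; a++)
        if (map[a] != UNDEF)
            s |= 1u << a;
    return s;
}

/* Only meaningful for maps whose values are addresses (forwarded). */
static set range_of(const int map[NADDR])
{
    set s = 0;
    for (int a = 0; a < NADDR; a++)
        if (map[a] != UNDEF) {
            assert(map[a] >= 0 && map[a] < NADDR);
            s |= 1u << map[a];
        }
    return s;
}

static bool is_one_to_one(const int map[NADDR])
{
    for (int a = 0; a < NADDR; a++)
        for (int b = a + 1; b < NADDR; b++)
            if (map[a] != UNDEF && map[a] == map[b])
                return false;
    return true;
}

static int effective_addr(const struct memory *m, int a)
{
    return m->forwarded[a] != UNDEF ? m->forwarded[a] : a;
}

static bool is_valid_addr(const struct memory *m, int a)
{
    return m->mem[effective_addr(m, a)] != UNDEF;
}

static bool is_valid_memory(const struct memory *m)
{
    for (int a = 0; a < NADDR; a++)   /* isValidAddrSet(m.roots, m) */
        if (((m->roots >> a) & 1u) && !is_valid_addr(m, a))
            return false;
    return is_one_to_one(m->mem) && is_one_to_one(m->forwarded)
        && (m->uncopied & m->unscanned) == 0
        && (m->uncopied & m->scanned) == 0
        && (m->unscanned & m->scanned) == 0
        && (m->uncopied & domain_of(m->forwarded)) == 0
        && (m->unscanned | m->scanned) == range_of(m->forwarded)
        && (m->uncopied | m->unscanned | m->scanned) == domain_of(m->mem);
}
```

In this model, copying the node at address 0 to a free address 4 means removing 0 from uncopied, adding 4 to unscanned, moving the mem binding from 0 to 4, and binding forwarded[0] to 4. The result still satisfies is_valid_memory, and effective_addr then maps 0 to 4.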


3.4.  Equiv  trait 


Equiv : trait
  includes Memory
  includes PairwiseElementTest2Arg(isEquivAddr, A, A, SA, SA, M, M,
                                   addrsEquiv for allPass)
  introduces
    isEquivAddr : A, A, M, M → Bool
    isEquivNode : N, N, M, M → Bool
    memEquiv : M, M → Bool
  asserts
    ∀ m, m' : M, a, a' : A, n, n' : N
      isEquivAddr(a, a', m, m') ==
        effectiveAddr(a, m') = effectiveAddr(a', m')
        ∧ isEquivNode(nodeAtAddr(a, m), nodeAtAddr(a', m'), m, m')
      isEquivNode(n, n', m, m') ==
        n.id = n'.id
        ∧ n.vals = n'.vals
        ∧ addrsEquiv(n.addrs, n'.addrs, m, m')
      memEquiv(m, m') ==
        isValidMemory(m) ∧ isValidMemory(m')
        ∧ addrsEquiv(m.roots, m'.roots, m, m')
        ∧ addrsEquiv(allNodes(m), allNodes(m'), m, m')
  implies
    converts isEquivAddr, isEquivNode, addrsEquiv, memEquiv

Figure 4: The Equiv Trait


The  Equiv  trait  captures  the  notion  of  equivalence  between  two  addresses,  two  nodes  or  two  memories.  Two 
addresses  are  equivalent  if  they  are  equal  or  one  is  the  forwarded  version  of  the  other  and  the  nodes  they 
point  to  are  equivalent.  Two  nodes  are  equivalent  if  they  have  the  same  UID  and  values  and  if  the  addresses 
contained  in  them  are  equivalent.  Two  memories  are  equivalent  if  they  are  both  well  formed  and  their  roots 
and all the nodes are equivalent. In addition to the functions directly defined, the function addrsEquiv is
defined by including the trait PairwiseElementTest2Arg with the function isEquivAddr, which is used to test
that all elements of one set have equivalent addresses in another.

3.5. Reachable trait


Reachable : trait
  includes Memory
  introduces
    reachable : SA, M → SA
    r1 : SA, SA, M → SA
  asserts
    ∀ m : M, a : A, as, as1, as2 : SA
      reachable(as, m) == r1({}, as, m)
      r1(as, {}, m) == as
      r1(as1, insert(a, as2), m) ==
        r1(insert(a, as1),
           (as2 ∪ (m.mem[a]).addrs) − insert(a, as1), m)
  implies
    converts reachable, r1

Figure 5: The Reachable Trait

The Reachable trait is the heart of the specification: all data reachable from the roots is live. Reachability
is the transitive closure of the “points to” relation starting from some given set of addresses. reachable
is defined using the helper function r1. The first two arguments to r1 are the visited and unvisited sets
respectively. reachable invokes r1 with the initial addresses in the unvisited set. The main action of r1 is to
transfer nodes from the unvisited to the visited set. When a node is transferred to the visited set, all the
addresses directly referred to by it are added to the unvisited set minus any addresses already in the visited
set. When the unvisited set is empty, r1 is done. No order of addition to either set is implied, and thus r1
does not specify any fixed search order.
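The recursion in r1 can be read as a loop over two sets. The following C sketch mirrors that reading under stated assumptions: addresses are small integers, sets of addresses are bit masks, and addrs[a] stands in for (m.mem[a]).addrs. The names set, NADDR, r1 and reachable as C functions are hypothetical. The sketch happens to pick the lowest-numbered unvisited address each time; the trait itself fixes no order.

```c
#include <assert.h>

#define NADDR 8   /* hypothetical, tiny address space */

typedef unsigned set;   /* bit a is set <=> address a is a member */

/* addrs[a] is the set of addresses stored in the node at address a.
 * Each iteration moves one address from unvisited to visited and adds
 * its referents to unvisited, minus anything already visited. */
static set r1(set visited, set unvisited, const set addrs[NADDR])
{
    while (unvisited != 0) {
        int a = 0;
        while (!((unvisited >> a) & 1u))   /* pick some member of unvisited */
            a++;
        assert(a < NADDR);
        visited |= 1u << a;
        unvisited = (unvisited | addrs[a]) & ~visited;
    }
    return visited;
}

static set reachable(set start, const set addrs[NADDR])
{
    return r1(0, start, addrs);
}
```

The loop terminates because the visited set gains a new address on every iteration and the address space is finite.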


3.6.  GC 


This section begins the presentation of the main body of the specification. The specification is presented
in a top-down fashion. Both the LCL interfaces and LSL traits are discussed. I typically present several
LCL interfaces that share a common trait, followed by the trait itself. This allows the reader to see how a
function is used before seeing the details of the function itself. A function's name should give some insight
into its semantics.

imports base;
uses GC(memory for M, addr for A);

void gc(void) memory mem; {
  requires isInitialMemory(mem^);
  modifies mem;
  ensures isFullGC(mem^, mem');
}

Figure 6: gc Interface

gc is the primary interface to the garbage collector. It performs a garbage collection but stops before
the spaces are “flipped”. The pre-condition is that the memory be in its pre-gc state, i.e., essentially that
nothing is yet copied. The post-condition is that all the reachable data have been copied and the memory
is in its post-gc state.


imports base;
uses GC(memory for M, addr for A);

void finalizeGC(void) memory mem; {
  requires isFinalGCMemory(mem^);
  modifies mem;
  ensures
    isInitialMemory(mem')
    /\ mem^.scanned = mem'.uncopied
    /\ mem^.roots = mem'.roots
    /\ mem^.mem = mem'.mem;
}

Figure 7: finalizeGC Interface

finalizeGC  “flips”  the  spaces.  The  pre-condition  is  that  a  GC  has  just  completed,  and  the  post-condition 
requires  that  the  implementation  ensure  that  the  memory  is  in  a  state  where  the  mutator  can  resume. 

GC : trait
  includes Memory
  includes Equiv
  includes Reachable
  introduces
    isFullGC : M, M → Bool
    isInitialMemory : M → Bool
    isFinalGCMemory : M → Bool
  asserts
    ∀ m, m' : M
      isFullGC(m, m') ==
        isInitialMemory(m)
        ∧ isFinalGCMemory(m')
        ∧ memEquiv(m, m')
        ∧ addrsEquiv(reachable(m.roots, m), m'.scanned, m, m')
      isInitialMemory(m) ==
        isValidMemory(m)
        ∧ {} = m.unscanned
        ∧ {} = m.scanned
        ∧ {} = m.forwarded
      isFinalGCMemory(m) ==
        isValidMemory(m)
        ∧ {} = m.unscanned
        ∧ {} = rootsUnforwarded(m)
  implies
    converts isFullGC, isInitialMemory, isFinalGCMemory

Figure 8: GC Trait

The GC trait captures the essential requirements of a copying collector. Initially the memory must be
entirely uncopied. When a GC completes, all the reachable data must have been copied to the scanned set,
the roots updated, the unscanned set empty, and otherwise the memories are still equivalent. All unreachable
nodes are left in the uncopied set.


3.7. Roots

imports base;
uses GC(memory for M, addr for A);

void forwardRoots(void) memory mem; {
  requires isInitialMemory(mem^);
  modifies mem;
  ensures
    {} = rootsUnforwarded(mem')
    /\ mem'.roots = mem'.unscanned
    /\ {} = mem'.scanned
    /\ memEquiv(mem^, mem');
}

Figure 9: forwardRoots Interface

forwardRoots is responsible for forwarding the roots. The pre-condition is that the memory has not
yet had anything copied. The post-condition is that all of the roots have been forwarded and are in the
unscanned set but that otherwise memory is unchanged.


imports base;

addr nextUnforwardedRoot(void) memory mem; {
  requires isValidMemory(mem^);
  ensures if {} = rootsUnforwarded(mem^)
          then result = aNIL
          else result \in rootsUnforwarded(mem^);
}

Figure 10: nextUnforwardedRoot interface

nextUnforwardedRoot returns an unforwarded root if one exists, aNIL otherwise. aNIL is just a user defined
LCL constant for a nil address.

uses Forward(memory for M, addr for A);

void forwardRootAddr(addr *a) memory mem; {
  requires
    isValidMemory(mem^) /\ (*a)^ \in rootsUnforwarded(mem^);
  modifies mem, *a;
  ensures
    mem'.roots = (mem^.roots - {(*a)^}) \U {(*a)'}
    /\ isForwardStep((*a)^, (*a)', mem^, mem');
}

Figure 11: The forwardRootAddr interface

forwardRootAddr forwards a single root. The pre-condition is that the address be an unforwarded root.
The post-condition is that the address is forwarded, and that its new value replaces the old value in the
roots. isForwardStep is defined in the Forward trait found below in figure 17.


3.8.  Scanning 


imports base;

void scanUnscanned(void) memory mem; {
  requires
    {} = rootsUnforwarded(mem^)
    /\ mem^.roots = mem^.unscanned
    /\ {} = mem^.scanned
    /\ isValidMemory(mem^);
  modifies mem;
  ensures
    mem^.roots = mem'.roots
    /\ mem'.unscanned = {}
    /\ memEquiv(mem^, mem')
    /\ addrsEquiv(reachable(mem^.roots, mem^),
                  mem'.scanned, mem^, mem');
}

Figure 12: The scanUnscanned Interface

scanUnscanned completes the transitive closure calculation starting from the forwarded roots. It requires
that  the  roots  all  be  forwarded  and  that  nothing  is  scanned.  The  post-condition  is  that  scanning  is  complete, 
and  that  all  nodes  reachable  from  the  initial  roots  have  been  copied  and  scanned,  but  that  otherwise  the 
memories  are  equivalent. 


imports base;

addr nextUnscannedNode(void) memory mem; {
  requires isValidMemory(mem^);
  ensures if {} = mem^.unscanned
          then result = aNIL
          else result \in mem^.unscanned;
}

Figure 13: The nextUnscannedNode interface

nextUnscannedNode must return an unscanned node unless there are none left, in which case it must
return aNIL.

scanAddr (figure 14) scans a single address. The pre-condition is that the address be unscanned and that
the nodes reachable from the roots also be reachable from the copied set. The post-condition is that the
address has been scanned, and that the nodes reachable from the roots are still reachable from the copied
set. New nodes may have been added to the copied set.

imports base;
uses Scan(memory for M, addr for A);

void scanAddr(addr a) memory mem; {
  requires
    isValidMemory(mem^)
    /\ addrUnscanned(a, mem^)
    /\ addrsEquiv(reachable(mem^.roots, mem^),
                  reachable(copiedNodes(mem^), mem^),
                  mem^, mem^);
  modifies mem;
  ensures
    isScanStep(a, mem^, mem')
    /\ addrsEquiv(reachable(mem^.roots, mem^),
                  reachable(copiedNodes(mem'), mem'),
                  mem^, mem');
}

Figure 14: The scanAddr interface


Scan : trait
  includes Memory
  includes Forward
  introduces
    isScannedAddr : A, M → Bool
    isScanStep : A, M, M → Bool
  asserts
    ∀ m, m' : M, a : A
      isScannedAddr(a, m) ==
        addrScanned(a, m)
        ∧ isForwardedAddr(a, a, m, m)
        ∧ isForwardedSet((m.mem[a]).addrs,
                         (m.mem[a]).addrs, m, m)
      isScanStep(a, m, m') ==
        memEquiv(m, m')
        ∧ m.roots = m'.roots
        ∧ addrUnscanned(a, m)
        ∧ addrScanned(a, m')
        ∧ isForwardedSet((m.mem[a]).addrs,
                         (m'.mem[a]).addrs, m, m')
  implies
    ∀ m, m' : M, a : A
      isScanStep(a, m, m') ⇒ isScannedAddr(a, m')
    converts isScannedAddr, isScanStep

Figure 15: The Scan trait

The Scan trait defines functions used to describe scanning. The function isScannedAddr is true if the
address is in the scanned set, and if both it and its references have been forwarded. The function isScanStep
relates two memories that differ only in that one step of scanning has occurred. This means that the
forwarded address of the node is added to the scanned set, and that all of its referents are forwarded. isScanStep
implies isScannedAddr.

3.9. Forwarding


imports base;
uses Forward(memory for M, addr for A);

void forwardAddr(addr *a) memory mem; {
  requires isValidMemory(mem^);
  modifies mem, *a;
  ensures
    if isForwardedAddr((*a)^, (*a)^, mem^, mem^)
    then (*a)' = (*a)^ /\ mem' = mem^
    else isForwardStep((*a)^, (*a)', mem^, mem');
}

Figure 16: The forwardAddr interface

forwardAddr  forwards  an  address  if  it  has  not  already  been  forwarded.  If  it  has  been  forwarded,  then 
nothing changes. The post-condition is that an unforwarded address is forwarded. Allowing forwardAddr
to be applied to already forwarded addresses gives additional flexibility to the specification which will be
discussed  later. 

Forward : trait
  includes Memory
  includes Copy
  includes PairwiseElementTest2Arg(isForwardedAddr, A, A, SA, SA, M, M,
                                   isForwardedSet for allPass)
  introduces
    isForwardedAddr : A, A, M, M → Bool
    isForwardStep : A, A, M, M → Bool
  asserts
    ∀ m, m' : M, a, a' : A
      isForwardedAddr(a, a', m, m') ==
        isCopiedAddr(a, a', m, m')
        ∧ effectiveAddr(a, m') = a'
      isForwardStep(a, a', m, m') ==
        memEquiv(m, m')
        ∧ addrUnforwarded(a, m)
        ∧ addrForwarded(a', m')
        ∧ (addrUncopied(a, m) ⇒ isCopyStep(a, m, m'))
        ∧ isCopiedAddr(a, a', m, m')
        ∧ m'.forwarded[a] = a'
        ∧ m.scanned = m'.scanned
  implies
    ∀ m, m' : M, a, a' : A
      isForwardStep(a, a', m, m') ⇒ isForwardedAddr(a, a', m, m')
    converts isForwardedAddr, isForwardStep

Figure 17: The Forward trait

The Forward trait defines functions used to describe forwarding. isForwardedAddr says that the address
must be copied, and that at least the post version of the address (a') must refer to the copied version of
the node. The indirectly defined isForwardedSet function says that for every address in one set of addresses,
some address in the other set satisfies isForwardedAddr. isForwardStep relates an unforwarded address and
a memory to a forwarded address, and a memory in which only the changes needed to forward the address
have occurred. If the address has not been copied, it is. The new address refers to the copy.


3.10. Copying

imports base;
uses Copy(memory for M, addr for A);

void copyAddr(addr a) memory mem; {
  requires
    isValidMemory(mem^)
    /\ addrUncopied(a, mem^);
  modifies mem;
  ensures isCopyStep(a, mem^, mem');
}

Figure 18: The copyAddr interface

copyAddr copies an uncopied address to a free location. No other changes are made to memory.

Copy : trait
  includes Memory
  includes Equiv
  introduces
    isCopiedAddr : A, A, M, M → Bool
    isCopyStep : A, M, M → Bool
  asserts
    ∀ m, m' : M, a, a' : A
      isCopiedAddr(a, a', m, m') ==
        isEquivAddr(a, a', m, m')
        ∧ addrCopied(effectiveAddr(a', m'), m')
      isCopyStep(a, m, m') ==
        memEquiv(m, m')
        ∧ addrUncopied(a, m)
        ∧ addrFree((m'.forwarded[a]), m)
        ∧ addrUnforwarded(m'.forwarded[a], m)
        ∧ m' = [m.roots,
                delete(a, m.uncopied),
                insert(m'.forwarded[a], m.unscanned),
                m.scanned,
                rebind(m.mem, a, m'.forwarded[a]),
                bind(m.forwarded, a, m'.forwarded[a])]
  implies
    ∀ m, m' : M, a : A
      isCopyStep(a, m, m') ⇒ isCopiedAddr(a, a, m, m')
    converts isCopiedAddr, isCopyStep

Figure 19: The Copy trait

The Copy trait defines isCopiedAddr and isCopyStep. isCopiedAddr is true if a and a' are equivalent
and the node they refer to has been copied in m'. isCopyStep says a is uncopied and the address it is to be
copied to is free and unforwarded. The memory after copying (m') is related to memory before copying (m)
in the following way: the roots and scanned sets are unchanged, a is removed from the uncopied set, and its
new location added to the unscanned set, the node referred to by a is now found at the new address, and the
forwarded map has a bound to its new location. isCopyStep only constrains the new location of the node to
be free but says nothing about nodes being copied to contiguous addresses.


4. Implementation and Informal Proof of Correctness

The implementation is simple, designed to be short and easy to understand without sacrificing any details
fundamental to the algorithm. All nodes are “cons” cells containing no data fields and two pointer fields,
car and cdr. Space is allocated for the forwarding pointer explicitly rather than using some part of the node
data as probably would be done in a real collector. Data representation issues such as tagging pointers, node
lengths, etc., while important in a real language implementation, are not essential to capturing the essence
of the copying collection algorithm and are thus ignored.

Originally I had not planned on proving the implementation correct, even in the informal manner done
here. However, as the specification proceeded, I found it very difficult to convince myself that I
had both included and excluded the right things. Informally verifying the implementation caused me
to make significant modifications to both the LSL and LCL portions of the specification, and gave me
greatly increased confidence that the specifications are essentially correct. At this point only a
complete formal verification would increase my confidence significantly, and even then I would be
surprised if it induced more than minor modifications.

The presentation follows the same top-down order as that of the specification. First, I present the
implementation's representation of memory in the form of the include file gc.h. This is followed by
a discussion of the abstraction function, which maps between the representation of memory used in
the implementation and that used in the specification, as well as an invariant which must be
preserved by the implementation. This invariant is needed for some of the proofs. I then present the
implementation of each of the interfaces, along with the informal proof that it satisfies its
specification. Unfortunately, this portion of the paper is difficult to read, as it requires
frequent back references to the specifications. The complete specification, found in the Appendix,
may be easier to refer to than the specifications in the previous section. The driver code used to
test the garbage collector is omitted.


4.1.  The  Representation  of  Memory 


The  include  file  gc.h  captures  the  implementation’s  representation  of  memory  and  plays  the  same  role  as  the 
Address,  Node,  and  Memory  traits  (figures  1,  2,  3). 

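    #define maxNumRoots 4
    #define maxNumNodes 12

    typedef int addr;                  /* an index into an array */

    typedef enum {CONS, FWD} tag_t;    /* CONS: uncopied node, FWD: forwarded node */

    typedef struct {
        tag_t tag;
        addr fwd;                      /* to-space address of the copy when tag is FWD */
        addr car;
        addr cdr;
    } node;

    typedef struct {
        addr roots[maxNumRoots];
        node to[maxNumNodes];          /* to-space */
        node from[maxNumNodes];        /* from-space */
        addr unscanned;                /* next free location in to-space */
        addr scanned;                  /* boundary between scanned and unscanned nodes */
        addr alloc;                    /* next free location in from-space */
        addr next_root;                /* next root to forward */
    } memory;

    extern const addr aNil;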


Addresses are simply indices into arrays. Nodes are structs with fields for a tag, a forwarding
address, a car address, and a cdr address. If the tag is CONS then the node is uncopied and the car
and cdr fields hold valid pointers. If the tag is FWD then the node has been copied and the fwd
field holds the to-space address of the copy. The memory struct closely mirrors the Memory trait.
The root array holds the roots; only elements which are not aNil are actually roots. The to and from
arrays form to-space and from-space and together make up the mem and forward maps. Any node in
from-space which has a tag FWD is part of the forward map, while all other from-space nodes and all
to-space nodes are part of the mem map. To-space is divided into the scanned and unscanned sets by
the scanned pointer, while the next free location in to-space is indicated by the unscanned pointer.
next_root is used during the forwarding of the roots to keep track of the next root to forward.
alloc indicates the next free location during mutation, and bounds the valid nodes in from-space.

Addresses are just integers. Thus it is impossible to tell whether an address should be used as an
index into the from array or the to array just by examining it. Because of this ambiguity the
implementation must be careful to keep track of which array an address refers to. This gives rise to
an important set of invariants which the implementation must maintain. For the root array, the
addresses located at indices in the range [0..next_root) refer to the to array, while those at
indices in the range [next_root..max_roots) refer to the from array. (The notation [m..n) denotes
the set of addresses including m, but excluding n.) In the from array, nodes with tag CONS contain
references into the from array in their car and cdr fields, and nodes with tag FWD contain
references into the to array in their fwd field. In the to array, all nodes at addresses in the
range [0..scanned) are forwarded and contain only references into the to array, while all nodes at
addresses in the range [scanned..unscanned) are unforwarded and contain only references into the
from array. These conditions are invariants: each routine in the implementation may assume they hold
at the beginning of its execution and must guarantee that they hold at the end. The proofs will
argue that these conditions are maintained.
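
The invariants above can be checked mechanically. The following is a minimal sketch, not part of the
original implementation: it repeats the gc.h declarations, assumes a global mem, and assumes the
value -1 for aNil (the source only declares aNil). The helpers inRange and checkInvariants are
hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define maxNumRoots 4
#define maxNumNodes 12

typedef int addr;
typedef enum { CONS, FWD } tag_t;
typedef struct { tag_t tag; addr fwd; addr car; addr cdr; } node;
typedef struct {
    addr roots[maxNumRoots];
    node to[maxNumNodes];
    node from[maxNumNodes];
    addr unscanned, scanned, alloc, next_root;
} memory;

static const addr aNil = -1;   /* assumed value; the source only declares aNil */
static memory mem;             /* assumed global, as used by the routines in Section 4 */

/* True when a lies in the half-open range [lo..hi). */
static bool inRange(addr a, addr lo, addr hi) { return lo <= a && a < hi; }

/* Hypothetical helper: returns true when the representation invariants hold. */
static bool checkInvariants(void)
{
    addr i;

    /* Roots below next_root refer to the to array, the rest to the from array. */
    for (i = 0; i < maxNumRoots; i++) {
        addr r = mem.roots[i];
        if (r == aNil) continue;                       /* unused root slot */
        if (i < mem.next_root) {
            if (!inRange(r, 0, mem.unscanned)) return false;
        } else {
            if (!inRange(r, 0, mem.alloc)) return false;
        }
    }

    /* From array: CONS nodes point into from, FWD nodes point into to. */
    for (i = 0; i < mem.alloc; i++) {
        const node *n = &mem.from[i];
        if (n->tag == CONS) {
            if (!inRange(n->car, 0, mem.alloc)) return false;
            if (!inRange(n->cdr, 0, mem.alloc)) return false;
        } else {
            if (!inRange(n->fwd, 0, mem.unscanned)) return false;
        }
    }

    /* To array: scanned nodes point into to, unscanned nodes point into from. */
    for (i = 0; i < mem.unscanned; i++) {
        const node *n = &mem.to[i];
        addr hi = (i < mem.scanned) ? mem.unscanned : mem.alloc;
        if (!inRange(n->car, 0, hi)) return false;
        if (!inRange(n->cdr, 0, hi)) return false;
    }
    return true;
}
```

A driver could call checkInvariants() on entry to and exit from each routine, which is the dynamic
counterpart of the assume/guarantee discipline described above.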

Now consider the correspondence between the implementation and the specification representations of
addresses, nodes and memory in a somewhat more formal light. The ambiguity noted above implies that
implementation addresses do not uniquely correspond to addresses in the specification. The
invariants given above allow us to disambiguate. An implementation node with a tag of CONS
corresponds directly to a node in the specification, with the car and cdr fields making up the
address set of the specification node. The implementation representation does not contain an
explicit UID, and the set of values is empty. Now consider how each component of the specification's
memory can be derived from the implementation's representation. M indicates the specification's
representation of memory; I have used the component names of the implementation's memory directly.
First consider the components of M which are sets.

    M.roots = {a ∈ [0..max_roots) | roots[a] != aNil}
    M.uncopied = {a ∈ [0..alloc) | from[a].tag = CONS}
    M.unscanned = [scanned..unscanned)
    M.scanned = [0..scanned)

M.mem consists of the map that maps all the addresses in M.uncopied to the nodes in the from array
at those addresses, and all the valid addresses in the to array to the nodes in the to array.

    ∀ a ∈ M.uncopied . M.mem[a] = from[a]
    ∀ a ∈ [0..unscanned) . M.mem[a] = to[a]

Finally, M.forward consists of the map which maps all the addresses in the from array which refer to
forwarded nodes to the addresses in those nodes' fwd fields:

    ∀ a ∈ {b ∈ [0..alloc) | from[b].tag = FWD} . M.forward[a] = from[a].fwd
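
The abstraction function can be read as a set of membership tests over the representation. The
sketch below is illustrative only: it repeats the gc.h declarations, assumes a global mem, and
introduces the hypothetical helpers inUncopied, inUnscanned, inScanned, isForwarded, absForward and
sizeUnscanned. Because from-space and to-space addresses are both plain integers, each helper states
in its comment which array its argument is taken to index.

```c
#include <assert.h>
#include <stdbool.h>

#define maxNumRoots 4
#define maxNumNodes 12

typedef int addr;
typedef enum { CONS, FWD } tag_t;
typedef struct { tag_t tag; addr fwd; addr car; addr cdr; } node;
typedef struct {
    addr roots[maxNumRoots];
    node to[maxNumNodes];
    node from[maxNumNodes];
    addr unscanned, scanned, alloc, next_root;
} memory;

static memory mem;   /* assumed global, as used by the routines in Section 4 */

/* M.uncopied = {a in [0..alloc) | from[a].tag = CONS}; a is a from-space address. */
static bool inUncopied(addr a)
{
    return 0 <= a && a < mem.alloc && mem.from[a].tag == CONS;
}

/* M.unscanned = [scanned..unscanned); a is a to-space address. */
static bool inUnscanned(addr a)
{
    return mem.scanned <= a && a < mem.unscanned;
}

/* M.scanned = [0..scanned); a is a to-space address. */
static bool inScanned(addr a)
{
    return 0 <= a && a < mem.scanned;
}

/* Domain of M.forward: from-space addresses whose node carries the tag FWD. */
static bool isForwarded(addr a)
{
    return 0 <= a && a < mem.alloc && mem.from[a].tag == FWD;
}

/* M.forward[a]; only meaningful when isForwarded(a) holds. */
static addr absForward(addr a)
{
    return mem.from[a].fwd;
}

/* Number of elements of M.unscanned. */
static int sizeUnscanned(void)
{
    return mem.unscanned - mem.scanned;
}
```

For example, immediately after one node has been copied but not yet scanned, its from-space address
satisfies isForwarded and its to-space address satisfies inUnscanned.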


4.2.  gc 

This  section  begins  the  top  down  presentation  of  the  code  and  the  informal  proof  of  correctness.  The 
implementation  itself  is  very  simple  and  will  not  be  commented  on  extensively.  The  arguments  that  the 
invariant  isValidMemory  is  maintained  have  been  omitted  as  they  are  obvious  but  long  and  tedious. 


    void gc(void)
    {
        forwardRoots();
        scanUnscanned();
    }

To show that gc satisfies its specification (figure 6) the following must be true: the pre-condition
of gc implies the pre-condition of forwardRoots, the post-condition of forwardRoots implies the
pre-condition of scanUnscanned, and the post-condition of scanUnscanned implies the post-condition
of gc.

The first point is trivial, since the pre-condition of gc is the same as the pre-condition of
forwardRoots. The second point follows directly from the fact that the first three conjuncts of the
post-condition of forwardRoots are the same as the first three conjuncts of the pre-condition of
scanUnscanned, and the last conjunct of the post-condition of forwardRoots (memEquiv(mem^, mem'))
directly implies the last conjunct of the pre-condition of scanUnscanned (isValidMemory(mem^)). The
mem' in the post-condition of forwardRoots is the same as mem^ in the pre-condition of
scanUnscanned.

The final point is also straightforward. Let the state of memory before any execution be m, after
executing forwardRoots be m', and after executing scanUnscanned be m''. After expanding isFullGC and
isFinalGCMemory, adding some facts from the post-condition of forwardRoots, and eliminating any
conjuncts which follow directly from the pre-conditions, it must be shown that:

    {} = rootsUnforwarded(m') ∧ m'.roots = m'.unscanned
    ∧ {} = m'.scanned ∧ memEquiv(m, m') ∧ m'.roots = m''.roots
    ∧ m''.unscanned = {} ∧ memEquiv(m', m'')
    ∧ addrsEquiv(reachable(m'.roots, m'), m''.scanned, m', m'')
    ⇒ memEquiv(m, m'')
      ∧ addrsEquiv(reachable(m.roots, m), m''.scanned, m, m'')
      ∧ isValidMemory(m'') ∧ {} = m''.unscanned
      ∧ {} = rootsUnforwarded(m'')

From the above one can conclude that all of the following hold:

    memEquiv(m, m') ∧ memEquiv(m', m'') ⇒ memEquiv(m, m'')

    memEquiv(m, m') ∧ m'.roots = m''.roots
    ∧ addrsEquiv(reachable(m'.roots, m'), m''.scanned, m', m'')
    ⇒ addrsEquiv(reachable(m.roots, m), m''.scanned, m, m'')

    memEquiv(m', m'') ⇒ {} = m''.unscanned

    {} = rootsUnforwarded(m') ∧ m'.roots = m''.roots
    ⇒ {} = rootsUnforwarded(m'')

and thus that the post-condition of scanUnscanned implies the post-condition of gc. Therefore gc
satisfies its specification.


4.3. finalizeGC


    void finalizeGC()
    {
        addr i;

        for (i = 0; i < mem.scanned; i++) {
            mem.from[i] = mem.to[i];
        }
        mem.alloc = mem.scanned;
        mem.next_root = mem.scanned = mem.unscanned = 0;
    }


finalizeGC is used to "flip" the spaces after gc has completed. The pre-condition for finalizeGC is
satisfied if it follows gc. For finalizeGC to satisfy its specification (figure 7) the
post-condition (isInitialMemory(mem') ∧ mem^.scanned = mem'.uncopied ∧ mem^.roots = mem'.roots
∧ mem^.mem = mem'.mem) must hold after finalizeGC executes.

The for loop copies scanned to uncopied without changing any addresses, satisfying mem^.scanned =
mem'.uncopied and mem^.mem = mem'.mem. The roots are not changed, so mem^.roots = mem'.roots holds.
isInitialMemory holds for the following reasons. Setting scanned and unscanned to 0 means the
scanned and unscanned sets are empty. None of the nodes which were in mem.to and which were copied
into mem.from had a tag FWD, so the forwarded map is empty. In a more typical implementation the
copy probably would not be done; the "flip" might be accomplished purely by changing pointers.
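
A flip done purely by changing pointers could look like the following minimal sketch. It is not the
implementation described here: the flipMemory struct, the global fmem, and the functions
initFlipMemory and finalizeGCByFlip are hypothetical, and the roots array is omitted for brevity.
The two spaces are reached through pointers, so exchanging the pointers replaces the copying loop.

```c
#include <assert.h>

#define maxNumNodes 12

typedef int addr;
typedef enum { CONS, FWD } tag_t;
typedef struct { tag_t tag; addr fwd; addr car; addr cdr; } node;

/* Hypothetical variant of the memory struct: to and from are pointers, not arrays. */
typedef struct {
    node space[2][maxNumNodes];   /* backing storage for the two semispaces */
    node *to;
    node *from;
    addr unscanned, scanned, alloc, next_root;
} flipMemory;

static flipMemory fmem;

/* Start with space[0] as from-space and space[1] as to-space. */
static void initFlipMemory(void)
{
    fmem.from = fmem.space[0];
    fmem.to = fmem.space[1];
    fmem.unscanned = fmem.scanned = fmem.alloc = fmem.next_root = 0;
}

/* Flip without copying: the old to-space becomes the new from-space. */
static void finalizeGCByFlip(void)
{
    node *tmp = fmem.from;

    fmem.from = fmem.to;
    fmem.to = tmp;
    fmem.alloc = fmem.scanned;
    fmem.next_root = fmem.scanned = fmem.unscanned = 0;
}
```

As with the copying version, no node in the new from-space below alloc carries the tag FWD, because
nodes are copied into to-space before their from-space original is tagged, so the forwarded map is
again empty after the flip.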


4.4. forwardRoots


    void forwardRoots(void)
    {
        addr r;

        while ((r = nextUnforwardedRoot()) != aNil) {
            forwardRootAddr(&mem.roots[r]);
        }
    }


To show that forwardRoots satisfies its specification (figure 9), it must be shown that, assuming
the pre-condition and loop termination, the post-condition is satisfied (partial correctness), and
that the loop terminates. Showing partial correctness of the loop requires a loop condition (LC) and
a loop invariant (LI), while showing loop termination requires a metric (M) which decreases
monotonically with each iteration of the loop.

    LC == {} != rootsUnforwarded(mem')
    LI == memEquiv(mem^, mem') ∧ {} = mem'.scanned
          ∧ mem'.unscanned ⊆ mem'.roots
    M  == size(rootsUnforwarded(mem')) [>= 0]

LI is true before the loop executes, assuming the pre-condition, because

    mem^ = mem' ⇒ memEquiv(mem^, mem')
    isInitialMemory(mem) ⇒ {} = mem.scanned
    isInitialMemory(mem) ⇒ {} = mem.unscanned
    {} = mem.unscanned ⇒ mem.unscanned ⊆ mem.roots

LI implies that the pre-condition for nextUnforwardedRoot holds, and the post-condition of
nextUnforwardedRoot guarantees that either the pre-condition for forwardRootAddr holds, or that the
loop terminates.

The post-condition of forwardRootAddr, along with the fact that a is an unforwarded root, implies
that LI remains true because:

    {} = mem^.scanned ∧ mem'.roots = (mem^.roots - (*a)^) ∪ (*a)'
    ∧ isForwardStep((*a)^, (*a)', mem^, mem')
    ⇒ memEquiv(mem^, mem') ∧ {} = mem'.scanned
      ∧ mem'.unscanned ⊆ mem'.roots

If the loop terminates then ¬LC ∧ LI holds, which satisfies the post-condition of forwardRoots
because:

    ¬LC ∧ LI ⇒ {} = rootsUnforwarded(mem')
               ∧ memEquiv(mem^, mem')
               ∧ {} = mem'.scanned ∧ mem'.unscanned ⊆ mem'.roots

    {} = rootsUnforwarded(mem') ∧ {} = mem'.scanned
        so mem'.roots ⊆ mem'.unscanned

    mem'.roots ⊆ mem'.unscanned ∧ mem'.unscanned ⊆ mem'.roots
        so mem'.roots = mem'.unscanned

The loop terminates because each time through the loop forwardRootAddr causes M to decrease. When M
reaches 0 the loop terminates.


4.5.  nextUnforwardedRoot 


    addr nextUnforwardedRoot(void) {
        /* Test the bound first so that roots[] is never read past its end. */
        while ((mem.next_root < maxNumRoots) &&
               (mem.roots[mem.next_root] == aNil)) {
            mem.next_root++;
        }
        if (mem.next_root >= maxNumRoots) return aNil;
        return mem.next_root;
    }


The specification for nextUnforwardedRoot is found in figure 10. The code loops through the roots
until it either finds an entry which is not aNil, which it then returns, or it runs out of roots, in
which case it returns aNil. The result is an unforwarded root if one remains and aNil otherwise,
thus satisfying the post-condition.


4.6. forwardRootAddr


    void forwardRootAddr(addr *r) {
        assert(r == &mem.roots[mem.next_root]);
        forwardAddr(r);
        mem.next_root++;
    }


The specification for forwardRootAddr is found in figure 11. The assert makes sure that
forwardRootAddr is in fact called with the next root, so that incrementing next_root correctly
reflects the fact that r has been forwarded. The pre-condition for forwardAddr is satisfied, and
furthermore the invariant guarantees that forwardAddr has been called with an unforwarded address.
Executing forwardAddr implies that isForwardStep holds, and modifies *r, which means the old address
is effectively removed from the roots and the new one added, so the post-condition holds.
Incrementing next_root maintains the invariant involving which roots have been forwarded.


4.7.  scanUnscanned 


    void scanUnscanned(void) {
        addr n;

        while ((n = nextUnscannedNode()) != aNil) {
            scanAddr(n);
        }
    }


Showing that scanUnscanned satisfies its specification (figure 12) requires showing both partial
correctness and loop termination, assuming that the pre-condition for scanUnscanned holds. The loop
condition (LC), loop invariant (LI), and a monotonically increasing metric (M) are:

    LC == mem'.unscanned != {}
    LI == mem^.roots = mem'.roots ∧ memEquiv(mem^, mem')
          ∧ addrsEquiv(reachable(mem^, mem^.roots),
                       reachable(mem', copiedNodes(mem')), mem^, mem')
    M  == size(nodesScanned(mem)) [<= size(allNodes(mem))]

Before the loop executes LI holds since

    mem^ = mem'
        ⇒ mem^.roots = mem'.roots ∧ memEquiv(mem^, mem')
    {} = mem^.scanned ⇒ copiedNodes(mem') = mem'.unscanned
    mem'.roots = mem'.unscanned
        ⇒ addrsEquiv(reachable(mem^, mem^.roots),
                     reachable(mem', copiedNodes(mem')), mem^, mem')

LI satisfies the pre-condition for nextUnscannedNode. The post-condition of nextUnscannedNode along
with LI satisfies the pre-condition for scanAddr. The post-condition of scanAddr implies LI, since
isScanStep implies that the roots stay constant and that the memories are equivalent, and the
reachability condition is an explicit part of the post-condition of scanAddr.

If the loop terminates then ¬LC ∧ LI holds and the following parts of the post-condition for
scanUnscanned can easily be discharged:

    mem^.roots = mem'.roots ⇒ mem^.roots = mem'.roots
    ¬(mem'.unscanned != {}) ⇒ mem'.unscanned = {}
    memEquiv(mem^, mem') ⇒ memEquiv(mem^, mem')

I can simplify the remaining conjunct of the post-condition by noting:

    mem'.unscanned = {}
    ∧ addrsEquiv(reachable(mem^, mem^.roots),
                 reachable(mem', copiedNodes(mem')), mem^, mem')
    ⇒ addrsEquiv(reachable(mem^, mem^.roots),
                 reachable(mem', mem'.scanned), mem^, mem')

Leaving us to show that

    ¬LC ∧ LI ⇒ reachable(mem', mem'.scanned) = mem'.scanned

Each element in mem'.scanned satisfies isScannedAddr, which means all of its pointers satisfy
isForwardedAddr and thus are either in mem'.scanned or mem'.unscanned. But mem'.unscanned is empty,
so every address referenced by an address in mem'.scanned must be in mem'.scanned as well. This
means reachable(mem', mem'.scanned) = mem'.scanned.

The loop terminates because each execution of scanAddr adds a node to the scanned set, and the
number of nodes which can be added to the scanned set is bounded by the total number of nodes.


4.8.  nextUnscannedNode 

    addr nextUnscannedNode() {
        if (mem.scanned >= mem.unscanned) return aNil;
        return mem.scanned;
    }


The specification of nextUnscannedNode is found in figure 13. As captured in the abstraction
function, unscanned = [scanned..unscanned). Thus if mem.scanned >= mem.unscanned then
{} = unscanned, and aNil should be returned. Otherwise an element of unscanned, mem.scanned, is
returned as required by the post-condition. The specification could be satisfied by returning any
unscanned element, but this implementation manages the unscanned set as a queue, with
nextUnscannedNode returning the head of the queue.


4.9. scanAddr

    void scanAddr(addr n) {
        assert(n == mem.scanned);
        forwardAddr(&mem.to[n].car);
        forwardAddr(&mem.to[n].cdr);
        mem.scanned++;
    }


The specification of scanAddr is found in figure 14. The assert makes sure that n is the location of
the first element of the unscanned set and thus that incrementing scanned moves the node located at
n from unscanned to scanned. The pre-condition for each forwardAddr is satisfied and the invariant
guarantees that each is called with an unforwarded address, since all nodes at addresses at or above
mem.scanned are guaranteed to contain only unforwarded addresses. The post-condition for forwardAddr
implies that both the car and the cdr are forwarded and that memEquiv holds. Since forwardAddr only
modifies the address passed to it, the roots are unchanged. Incrementing scanned moves n into the
scanned set without changing the rest of memory. Taken together the last three points mean that
isScanStep holds. The reachability condition is satisfied because n is in the copied set and the two
forwardAddr calls at most add the car and cdr to the copied set, so the nodes reachable from the
copied set are not changed. The invariant is maintained because the addresses in n are now
forwarded, and scanned is greater than n, indicating that n is in the scanned set.


4.10. forwardAddr

    void forwardAddr(addr *a) {
        if (mem.from[*a].tag != FWD) copyAddr(*a);
        *a = mem.from[*a].fwd;
    }


The specification of forwardAddr is found in figure 16. In this implementation forwardAddr is called
only with unforwarded addresses, since in both places where forwardAddr is used the invariant states
that the addresses are unforwarded. The specified interface is more general, to allow the
specification to be more broadly applicable. Given that a is unforwarded, forwardAddr must ensure
isForwardStep. If the node has not been copied, then its tag is CONS and the pre-condition for
copyAddr holds. This along with the post-condition of copyAddr implies that the
(addrUncopied(a, m) ⇒ isCopyStep(a, m, m')) ∧ isCopiedAddr(a, a', m, m') conjuncts of isForwardStep
hold. If the tag is FWD then isCopiedAddr(a, a', m, m') already holds. The pointer update makes
m'.forwarded[a] = a' hold. Neither of these things changes the equivalence between memories. They
also do not change the scanned set, so isForwardStep holds and forwardAddr satisfies its
specification.


4.11.  copyAddr 

    void copyAddr(addr a) {
        mem.to[mem.unscanned] = mem.from[a];
        mem.from[a].tag = FWD;
        mem.from[a].fwd = mem.unscanned;
        mem.unscanned++;
    }


The specification of copyAddr is found in figure 18. addrFree is satisfied because the node is
copied to mem.unscanned, which points to a free location. addrUnforwarded is satisfied because
mem.unscanned is not forwarded. The roots and scanned sets are unchanged. Setting
mem.from[a].tag = FWD removes the node from uncopied. Incrementing unscanned adds the new address to
unscanned. Copying the node to unscanned rebinds it in memory. Finally,
mem.from[a].fwd = mem.unscanned adds the new address to the forwarding map. None of this changes the
equivalence of the memories. Choosing unscanned as the location to copy the node to completes the
breadth-first management of the unscanned set, with unscanned acting as the tail of the queue. Since
the node is copied to a location at or above scanned, and it contains only unforwarded addresses,
the invariant holds.
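
The routines of this section can be assembled into one file and exercised on a small graph. The
sketch below repeats the code above with three stated assumptions: aNil is given the value -1, mem
is a global, and nextUnforwardedRoot tests the bound before reading the root. The helper cons and
the function demoLiveNodesAfterGC are hypothetical stand-ins for the omitted driver code. Every car
and cdr in the demo holds a valid address, since forwardAddr does not test for aNil.

```c
#include <assert.h>

#define maxNumRoots 4
#define maxNumNodes 12

typedef int addr;
typedef enum { CONS, FWD } tag_t;
typedef struct { tag_t tag; addr fwd; addr car; addr cdr; } node;
typedef struct {
    addr roots[maxNumRoots];
    node to[maxNumNodes];
    node from[maxNumNodes];
    addr unscanned, scanned, alloc, next_root;
} memory;

static const addr aNil = -1;   /* assumed value; the source only declares aNil */
static memory mem;

static void copyAddr(addr a) {
    mem.to[mem.unscanned] = mem.from[a];
    mem.from[a].tag = FWD;
    mem.from[a].fwd = mem.unscanned;
    mem.unscanned++;
}

static void forwardAddr(addr *a) {
    if (mem.from[*a].tag != FWD) copyAddr(*a);
    *a = mem.from[*a].fwd;
}

static addr nextUnforwardedRoot(void) {
    while (mem.next_root < maxNumRoots && mem.roots[mem.next_root] == aNil)
        mem.next_root++;
    return (mem.next_root >= maxNumRoots) ? aNil : mem.next_root;
}

static void forwardRootAddr(addr *r) {
    assert(r == &mem.roots[mem.next_root]);
    forwardAddr(r);
    mem.next_root++;
}

static void forwardRoots(void) {
    addr r;
    while ((r = nextUnforwardedRoot()) != aNil) forwardRootAddr(&mem.roots[r]);
}

static addr nextUnscannedNode(void) {
    return (mem.scanned >= mem.unscanned) ? aNil : mem.scanned;
}

static void scanAddr(addr n) {
    assert(n == mem.scanned);
    forwardAddr(&mem.to[n].car);
    forwardAddr(&mem.to[n].cdr);
    mem.scanned++;
}

static void scanUnscanned(void) {
    addr n;
    while ((n = nextUnscannedNode()) != aNil) scanAddr(n);
}

static void gc(void) { forwardRoots(); scanUnscanned(); }

static void finalizeGC(void) {
    addr i;
    for (i = 0; i < mem.scanned; i++) mem.from[i] = mem.to[i];
    mem.alloc = mem.scanned;
    mem.next_root = mem.scanned = mem.unscanned = 0;
}

/* Hypothetical mutator helper: allocate a cons cell in from-space. */
static addr cons(addr car, addr cdr) {
    addr a = mem.alloc++;
    mem.from[a].tag = CONS;
    mem.from[a].fwd = aNil;
    mem.from[a].car = car;
    mem.from[a].cdr = cdr;
    return a;
}

/* Hypothetical demo: three nodes, one of them garbage; returns the live count. */
static int demoLiveNodesAfterGC(void) {
    int i;
    addr leaf, top;

    mem.alloc = mem.scanned = mem.unscanned = mem.next_root = 0;
    for (i = 0; i < maxNumRoots; i++) mem.roots[i] = aNil;
    leaf = cons(0, 0);            /* node 0 points to itself */
    (void)cons(leaf, leaf);       /* node 1 is unreachable from the roots */
    top = cons(leaf, leaf);       /* node 2 points to node 0 */
    mem.roots[1] = top;
    gc();
    finalizeGC();
    return mem.alloc;
}
```

In the demo the root node is copied first and the leaf second, so after finalizeGC the root refers
to address 0, whose car and cdr both refer to address 1, and the garbage node has been left behind.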


5. Application to Other Garbage Collectors


The implementation in Section 4 shows in detail how the specification applies to a simple two-space
copying collector. It also applies to other copying-based collectors, including generational copying
collectors, incremental copying collectors, and collectors which do not use breadth-first traversal
of the node graph. This section considers how the specification applies to these variations of CGC.


5.1. Generational


Generational collectors attempt to minimize the cost of garbage collection by concentrating their
efforts on those portions of memory that are most likely to contain garbage. Typically these are the
portions of memory that have been most recently allocated. Generational collectors divide data into
a number of generations that group the data by how old it is. They then collect younger generations
more frequently than the older ones [11].

The specification above can be used to describe generational collectors by simply choosing what to
consider as the roots. For simple collectors, the roots are the global data structures, the stack,
and the registers. For a generational collector the roots must also include any pointers from other
generations into the one being collected. Tracking these inter-generational pointers is one of the
major design issues in implementing a generational collector, but lies outside the scope of this
specification. Given a set of roots that includes all the needed inter-generational pointers, the
specification can stand without change.
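
The choice of roots for a young-generation collection can be sketched as follows. This is an
illustration under several assumptions rather than a description of any particular collector: the
gnode struct with its gen field, the global heap, and the helpers addRoot and youngRoots are
hypothetical; roots are represented as node addresses rather than pointer locations; aNil is assumed
to be -1; and the inter-generational pointers are found by scanning every old node, which a real
collector would avoid by tracking them.

```c
#include <assert.h>

#define maxNumNodes 12
#define maxNumRoots 8

typedef int addr;
typedef struct { int gen; addr car; addr cdr; } gnode;   /* gen: 0 = young, 1 = old */

static const addr aNil = -1;       /* assumed value */
static gnode heap[maxNumNodes];
static int heapSize;

/* Append a to roots[0..n) if it is a young node not already present; returns the new count. */
static int addRoot(addr roots[], int n, addr a)
{
    int i;

    if (a == aNil || heap[a].gen != 0) return n;
    for (i = 0; i < n; i++) {
        if (roots[i] == a) return n;
    }
    assert(n < maxNumRoots);
    roots[n] = a;
    return n + 1;
}

/* Roots for collecting generation 0: the young nodes named by the ordinary roots,
   plus every young node referenced from an old node. */
static int youngRoots(const addr ordinary[], int nOrdinary, addr roots[])
{
    int n = 0;
    int i;

    for (i = 0; i < nOrdinary; i++) {
        n = addRoot(roots, n, ordinary[i]);
    }
    for (i = 0; i < heapSize; i++) {
        if (heap[i].gen == 1) {
            n = addRoot(roots, n, heap[i].car);
            n = addRoot(roots, n, heap[i].cdr);
        }
    }
    return n;
}
```

Once such a root set has been built, the collection of the young generation proceeds exactly as
specified, with the old generation playing no further part.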


5.2.  Non-breadth  first 

Some research has been done on collectors that do not use a breadth-first traversal of the node
graph [12] [9]. The intention is to improve locality by clustering closely connected portions of the
node graph. Since data are accessed by following pointers, a copying strategy that copies subtrees
of the graph so that they are physically close to each other may have this effect.

These collectors can still be described with the specification above. Two techniques can be used to
change the order in which nodes are copied. One technique is to make the implementation's
representation of the scanned set more complicated so as to allow nodes above the unscanned pointer
to be in the scanned set. Since the specification does not dictate the representation of the scanned
or unscanned set, it is applicable to this technique. The other technique is to keep the
representation of the scanned and unscanned sets as is, but to allow references in nodes in the
unscanned set to be forwarded. This eager forwarding can change the order of copying without
otherwise changing the basic algorithm, as long as forwardAddr can ignore already forwarded
references. This is why forwardAddr is specified so that it can ignore already forwarded references.
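
The second technique can be sketched as follows. This is an illustrative policy, not one taken from
[12] or [9]: copyAddr eagerly forwards the car field of each node as it is copied, so that a node's
car subtree is placed next to it, while the cdr field is left for the scan loop. Because plain
integer addresses cannot distinguish the two spaces, the sketch assumes an encoding in which
to-space addresses are offset by TO_BASE; TO_BASE and isForwardedRef are hypothetical, and the roots
and the rest of the memory struct are omitted.

```c
#include <assert.h>
#include <stdbool.h>

#define maxNumNodes 12
#define TO_BASE maxNumNodes   /* assumed encoding: to-space addresses are >= TO_BASE */

typedef int addr;
typedef enum { CONS, FWD } tag_t;
typedef struct { tag_t tag; addr fwd; addr car; addr cdr; } node;

static node from[maxNumNodes];
static node to[maxNumNodes];
static addr scanned, unscanned;   /* plain indices into to[] */

/* True when a already refers to to-space, that is, when it has been forwarded. */
static bool isForwardedRef(addr a) { return a >= TO_BASE; }

static void forwardAddr(addr *a);

static void copyAddr(addr a)
{
    addr dst = unscanned++;

    to[dst] = from[a];
    from[a].tag = FWD;               /* tag before recursing so that cycles terminate */
    from[a].fwd = TO_BASE + dst;
    forwardAddr(&to[dst].car);       /* eager forwarding of a node still in unscanned */
}

static void forwardAddr(addr *a)
{
    if (isForwardedRef(*a)) return;  /* ignore already forwarded references */
    if (from[*a].tag != FWD) copyAddr(*a);
    *a = from[*a].fwd;
}

static void scanUnscanned(void)
{
    while (scanned < unscanned) {
        forwardAddr(&to[scanned].car);   /* no effect: forwarded when the node was copied */
        forwardAddr(&to[scanned].cdr);
        scanned++;
    }
}
```

For a root node 0 with car 1 and cdr 2, where node 1 has car 3, this policy copies the nodes in the
order 0, 1, 3, 2, whereas the breadth-first collector of Section 4 would copy them in the order
0, 1, 2, 3.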


5.3.  Incremental 

Incremental collectors work by interleaving collection with mutation. Recently work has been done on
a new incremental copying collector that has some unusual properties with respect to the handling of
roots [10]. When an incremental collection starts, it uses the roots as "hints" about what to copy,
but does not forward them, since that would violate certain invariants needed by the mutator. As the
collection proceeds, the collector periodically resamples the roots for new parts of the graph to be
copied. When the incremental collection is completed, the roots are forwarded, and the spaces
"flipped".

The modifications to the specification to accommodate this incremental collector would be more
extensive. The roots would have to consist of the union of all the roots sampled during collection.
An interface to allow the copying of a subset of the roots would be needed. The forwardRoots
interface would need to change so that it could forward only a subset of the roots. The overall
structure would need to change to allow for repeatedly copying roots and then scanning the unscanned
portions, forwarding roots only at termination. In addition, many of the proofs of correctness would
need to change.

I mention this style of collection because, while working on the specification, I realized that the
original incremental collector implementation was flawed in an important way. All the roots were
sampled each time the incremental collector got control. In fact, for correctness, one only needs to
guarantee that the transitive closure of the roots at a flip is copied. This is obvious from the
specification of isFullGC. One still must sample some of the roots to get the collector started, but
once it is started, one only needs to resample when trying to finish. The sampling of unneeded roots
may well lead to data being copied that does not need to be. This flaw has been corrected.
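
The transitive closure referred to here is the reachable function of the Reachable trait in the
Appendix. A minimal worklist sketch of that function follows. The names cell, addrSet, heap and
single are hypothetical; sets of addresses are represented as bit masks, which suffices for the
twelve nodes of the implementation in Section 4; and every car and cdr is assumed to hold a valid
address.

```c
#include <assert.h>

#define maxNumNodes 12

typedef int addr;
typedef struct { addr car; addr cdr; } cell;
typedef unsigned int addrSet;     /* bit a is set exactly when address a is in the set */

static cell heap[maxNumNodes];

/* The singleton set {a}. */
static addrSet single(addr a) { return 1u << a; }

/* Transitive closure of roots under the car and cdr pointers. The variables done
   and todo play the role of the two set arguments of the auxiliary function in the
   Reachable trait: each step moves one address from todo to done and adds that
   node's addresses to todo, minus everything already done. */
static addrSet reachable(addrSet roots)
{
    addrSet done = 0;
    addrSet todo = roots;

    while (todo != 0) {
        addr a = 0;

        while (!(todo & single(a))) a++;     /* pick some a in todo */
        done |= single(a);
        todo = (todo | single(heap[a].car) | single(heap[a].cdr)) & ~done;
    }
    return done;
}
```

Any root whose closure is already contained in the set of copied nodes contributes nothing new,
which is why resampling every root at every increment is unnecessary for correctness.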


6.  Related  work 


Considering its importance, there are surprisingly few published attempts at formalizing garbage
collection. Even The Definition of Standard ML [8], a formal semantics of SML, contains the
statement

    There are no rules concerning disposal of inaccessible addresses ("garbage collection").

The notable exception to this lack is the work by Demers et al. [2]. Their work differs from mine in
several important ways. First, they are concerned with characterizing what data is preserved by a
garbage collector (notably conservative and/or generational collectors), rather than capturing the
details of a particular algorithm. In fact, their framework should apply equally to CGC and MSGC,
although in their paper they apply it primarily to MSGC. In their terminology, my specification
models a precise garbage collector, that is, one which retains exactly those nodes reachable from
the roots. They are concerned with describing imprecise collectors, that is, ones which may retain
some nodes which are not reachable from the roots. They show that such imprecise collectors can be
described by a precise collection with an augmentation to the points-to relation. This is the same
sense in which my specification models generational GC, by augmenting the roots with the needed
inter-generational pointers. Another way that their work differs from mine is in presentation. My
formalization is presented in terms of a formal specification language, while their presentation
uses more conventional mathematical notation. Finally, they use their formalization to describe
several implementations at a relatively high level of detail, while mine is used to prove the
detailed correctness of a single simple collector.


7. Appendix


7.1. Traits

Address : trait
  includes Set(A, SA)                 % Sets of Addresses


Node : trait
  includes Address
  includes Set(Val, SV)               % Sets of Values
  includes Set(N, SN)                 % Sets of Nodes
  N tuple of id : UID, addrs : SA, vals : SV


MemoryMain : trait
  includes Node
  includes FiniteMappingAux(ANMap, A, N, SA for SDomain)
  includes FiniteMappingAux(AAMap, A, A, SA for SRange, SA for SDomain)
  includes TestSet1Arg(isValidAddr, isValidAddrSet, A, SA, M)
  M tuple of roots : SA,
             uncopied : SA,
             unscanned : SA,
             scanned : SA,
             mem : ANMap,
             forwarded : AAMap
  introduces
    isValidMemory : M → Bool
    isValidAddr : A, M → Bool
    effectiveAddr : A, M → A
  asserts
    ∀ m : M, a : A, n : N
    isValidMemory(m) ==
      isOneToOne(m.mem) ∧ isOneToOne(m.forwarded)
      ∧ isValidAddrSet(m.roots, m)
      ∧ m.uncopied ∩ m.unscanned = {}
      ∧ m.uncopied ∩ m.scanned = {}
      ∧ m.unscanned ∩ m.scanned = {}
      ∧ m.uncopied ∩ domain(m.forwarded) = {}
      ∧ m.unscanned ∪ m.scanned = range(m.forwarded)
      ∧ m.uncopied ∪ m.unscanned ∪ m.scanned = domain(m.mem)
    isValidAddr(a, m) == if defined(m.forwarded, a)
                         then defined(m.mem, m.forwarded[a])
                         else defined(m.mem, a)
    effectiveAddr(a, m) == if defined(m.forwarded, a)
                           then m.forwarded[a]
                           else a
  implies
    converts isValidMemory, isValidAddr, isValidAddrSet, effectiveAddr
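As an illustration of the MemoryMain invariants, the following C fragment is a minimal sketch, not the implementation discussed in this paper. It assumes a hypothetical 16-address space in which each set of addresses is a bit mask; the names Memory, memory_empty, effective_addr, is_valid_addr and sets_are_valid are invented for the example. Only the set-valued conjuncts of isValidMemory are checked; the isOneToOne and isValidAddrSet conjuncts are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NADDR 16            /* hypothetical tiny address space: 0..15 */
#define UNDEF (-1)          /* "a is not in domain(forwarded)" */

typedef uint32_t AddrSet;   /* bit a set  <=>  address a is a member */

typedef struct {
    AddrSet roots, uncopied, unscanned, scanned;
    AddrSet mem_domain;     /* domain(m.mem); node contents are omitted */
    int forwarded[NADDR];   /* forwarded[a], or UNDEF */
} Memory;

/* A memory with empty sets and an empty forwarding map. */
static Memory memory_empty(void) {
    Memory m = {0};
    for (int a = 0; a < NADDR; a++) m.forwarded[a] = UNDEF;
    return m;
}

/* effectiveAddr(a, m): follow the forwarding entry if there is one. */
static int effective_addr(int a, const Memory *m) {
    return m->forwarded[a] != UNDEF ? m->forwarded[a] : a;
}

/* isValidAddr(a, m): the effective address must hold a node. */
static bool is_valid_addr(int a, const Memory *m) {
    return (m->mem_domain >> effective_addr(a, m)) & 1u;
}

/* The set-valued conjuncts of isValidMemory. */
static bool sets_are_valid(const Memory *m) {
    AddrSet dom = 0, rng = 0;   /* domain and range of m->forwarded */
    for (int a = 0; a < NADDR; a++)
        if (m->forwarded[a] != UNDEF) {
            dom |= 1u << a;
            rng |= 1u << m->forwarded[a];
        }
    return (m->uncopied & m->unscanned) == 0
        && (m->uncopied & m->scanned) == 0
        && (m->unscanned & m->scanned) == 0
        && (m->uncopied & dom) == 0
        && (m->unscanned | m->scanned) == rng
        && (m->uncopied | m->unscanned | m->scanned) == m->mem_domain;
}
```

The bit-mask representation makes each conjunct a single expression, which is convenient for checking the invariant in a debugging build.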


Equiv : trait
  includes Memory
  includes PairwiseElementTest2Arg(isEquivAddr, A, A, SA, SA, M, M,
                                   addrsEquiv for allPass)
  introduces
    isEquivAddr : A, A, M, M → Bool
    isEquivNode : N, N, M, M → Bool
    memEquiv : M, M → Bool
  asserts
    ∀ m, m' : M, a, a' : A, n, n' : N
    isEquivAddr(a, a', m, m') ==
      effectiveAddr(a, m') = effectiveAddr(a', m')
      ∧ isEquivNode(nodeAtAddr(a, m), nodeAtAddr(a', m'), m, m')
    isEquivNode(n, n', m, m') ==
      n.id = n'.id
      ∧ n.vals = n'.vals
      ∧ addrsEquiv(n.addrs, n'.addrs, m, m')
    memEquiv(m, m') ==
      isValidMemory(m) ∧ isValidMemory(m')
      ∧ addrsEquiv(m.roots, m'.roots, m, m')
      ∧ addrsEquiv(allNodes(m), allNodes(m'), m, m')
  implies
    converts isEquivAddr, isEquivNode, addrsEquiv, memEquiv


Reachable : trait
  includes Memory
  introduces
    reachable : SA, M → SA
    r1 : SA, SA, M → SA
  asserts
    ∀ m : M, a : A, as, as1, as2 : SA
    reachable(as, m) == r1({}, as, m)
    r1(as, {}, m) == as
    r1(as1, insert(a, as2), m) ==
      r1(insert(a, as1),
         (as2 ∪ (m.mem[a]).addrs) - insert(a, as1), m)
  implies
    converts reachable, r1


GC : trait
  includes Memory
  includes Equiv
  includes Reachable
  introduces
    isFullGC : M, M → Bool
    isInitialMemory : M → Bool
    isFinalGCMemory : M → Bool
  asserts
    ∀ m, m' : M
    isFullGC(m, m') ==
      isInitialMemory(m)
      ∧ isFinalGCMemory(m')
      ∧ memEquiv(m, m')
      ∧ addrsEquiv(reachable(m.roots, m), m'.scanned, m, m')
    isInitialMemory(m) ==
      isValidMemory(m)
      ∧ {} = m.unscanned
      ∧ {} = m.scanned
      ∧ {} = m.forwarded
    isFinalGCMemory(m) ==
      isValidMemory(m)
      ∧ {} = m.unscanned
      ∧ {} = rootsUnforwarded(m)
  implies
    converts isFullGC, isInitialMemory, isFinalGCMemory
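The reachable operator that isFullGC relies on is defined in the Reachable trait by a helper with a visited set and a pending set. The following C fragment is a minimal sketch of that recursion written as a loop. It assumes a hypothetical 16-address space with bit-mask sets, and an array addrs in which addrs[a] stands in for (m.mem[a]).addrs; every pending address is assumed to lie in 0..15.

```c
#include <assert.h>
#include <stdint.h>

#define NADDR 16            /* hypothetical tiny address space: 0..15 */
typedef uint32_t AddrSet;   /* bit a set  <=>  address a is a member */

/* r1(done, todo, m): done holds visited addresses, todo the pending ones.
   addrs[a] stands in for (m.mem[a]).addrs. */
static AddrSet r1(AddrSet done, AddrSet todo, const AddrSet addrs[NADDR]) {
    while (todo != 0) {                       /* r1(as, {}, m) == as */
        int a = 0;
        while (((todo >> a) & 1u) == 0) a++;  /* choose some a in todo */
        done |= 1u << a;                      /* insert(a, as1) */
        todo = (todo | addrs[a]) & ~done;     /* (as2 U addrs) - insert(a, as1) */
    }
    return done;
}

/* reachable(as, m) == r1({}, as, m) */
static AddrSet reachable(AddrSet as, const AddrSet addrs[NADDR]) {
    return r1(0, as, addrs);
}
```

The loop terminates because done grows on every iteration and todo never contains an address that is already in done.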


Scan : trait
  includes Memory
  includes Forward
  introduces
    isScannedAddr : A, M → Bool
    isScanStep : A, M, M → Bool
  asserts
    ∀ m, m' : M, a : A
    isScannedAddr(a, m) ==
      addrScanned(a, m)
      ∧ isForwardedAddr(a, a, m, m)
      ∧ isForwardedSet((m.mem[a]).addrs,
                       (m.mem[a]).addrs, m, m)
    isScanStep(a, m, m') ==
      memEquiv(m, m')
      ∧ m.roots = m'.roots
      ∧ addrUnscanned(a, m)
      ∧ addrScanned(a, m')
      ∧ isForwardedSet((m.mem[a]).addrs,
                       (m'.mem[a]).addrs, m, m')
  implies
    ∀ m, m' : M, a : A
    isScanStep(a, m, m') ⇒ isScannedAddr(a, m')
    converts isScannedAddr, isScanStep


Forward : trait
  includes Memory
  includes Copy
  includes PairwiseElementTest2Arg(isForwardedAddr, A, A, SA, SA, M, M,
                                   isForwardedSet for allPass)
  introduces
    isForwardedAddr : A, A, M, M → Bool
    isForwardStep : A, A, M, M → Bool
  asserts
    ∀ m, m' : M, a, a' : A
    isForwardedAddr(a, a', m, m') ==
      isCopiedAddr(a, a', m, m')
      ∧ effectiveAddr(a, m') = a'
    isForwardStep(a, a', m, m') ==
      memEquiv(m, m')
      ∧ addrUnforwarded(a, m)
      ∧ addrForwarded(a', m')
      ∧ (addrUncopied(a, m) ⇒ isCopyStep(a, m, m'))
      ∧ isCopiedAddr(a, a', m, m')
      ∧ m'.forwarded[a] = a'
      ∧ m.scanned = m'.scanned
  implies
    ∀ m, m' : M, a, a' : A
    isForwardStep(a, a', m, m') ⇒ isForwardedAddr(a, a', m, m')
    converts isForwardedAddr, isForwardStep


Copy : trait
  includes Memory
  includes Equiv
  introduces
    isCopiedAddr : A, A, M, M → Bool
    isCopyStep : A, M, M → Bool
  asserts
    ∀ m, m' : M, a, a' : A
    isCopiedAddr(a, a', m, m') ==
      isEquivAddr(a, a', m, m')
      ∧ addrCopied(effectiveAddr(a', m'), m')
    isCopyStep(a, m, m') ==
      memEquiv(m, m')
      ∧ addrUncopied(a, m)
      ∧ addrFree((m'.forwarded[a]), m)
      ∧ addrUnforwarded(m'.forwarded[a], m)
      ∧ m' = [m.roots,
              delete(a, m.uncopied),
              insert(m'.forwarded[a], m.unscanned),
              m.scanned,
              rebind(m.mem, a, m'.forwarded[a]),
              bind(m.forwarded, a, m'.forwarded[a])]
  implies
    ∀ m, m' : M, a : A
    isCopyStep(a, m, m') ⇒ isCopiedAddr(a, a, m, m')
    converts isCopiedAddr, isCopyStep
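
The tuple update in isCopyStep can be read as a recipe for building m' from m. The following C fragment is a minimal sketch of that reading, not the implementation discussed in this paper. It assumes a hypothetical 16-address space with bit-mask sets and non-negative integer payloads standing in for nodes; the names Memory, copy_step and fresh are invented for the example. Only the addrUncopied and addrFree preconditions are checked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NADDR 16            /* hypothetical tiny address space: 0..15 */
#define UNDEF (-1)          /* "no binding at this address" */

typedef uint32_t AddrSet;   /* bit a set  <=>  address a is a member */

typedef struct {
    AddrSet roots, uncopied, unscanned, scanned;
    int mem[NADDR];         /* node payload at each address, or UNDEF */
    int forwarded[NADDR];   /* forwarding entry, or UNDEF */
} Memory;

/* Returns m' for a copy of the node at a to the free address fresh. */
static Memory copy_step(Memory m, int a, int fresh) {
    assert((m.uncopied >> a) & 1u);      /* addrUncopied(a, m) */
    assert(m.mem[fresh] == UNDEF);       /* addrFree(m'.forwarded[a], m) */
    Memory n = m;                        /* roots and scanned carry over */
    n.uncopied  &= ~(1u << a);           /* delete(a, m.uncopied) */
    n.unscanned |= 1u << fresh;          /* insert(fresh, m.unscanned) */
    n.mem[fresh] = m.mem[a];             /* rebind(m.mem, a, fresh) */
    n.mem[a]     = UNDEF;
    n.forwarded[a] = fresh;              /* bind(m.forwarded, a, fresh) */
    return n;
}
```

Passing the memory by value keeps m available after the call, so the pre-state and the post-state can be compared, as the specification does with m and m'.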


These are some of the less important traits which were not discussed in the text.


Memory : trait
  includes MemoryMain
  includes MemoryAuxiliary


MemoryAuxiliary : trait
  includes MemoryMain
  includes SetOps
  includes ElementTest(addrUnforwarded, A, SA, M, addrsUnforwarded for filter)
  introduces
    nodeAtAddr : A, M → N
    allNodes : M → SA
    copiedNodes : M → SA
    addrFree : A, M → Bool
    addrCopied : A, M → Bool
    addrUncopied : A, M → Bool
    addrUnscanned : A, M → Bool
    addrScanned : A, M → Bool
    addrForwarded : A, M → Bool
    addrUnforwarded : A, M → Bool
    addrRoot : A, M → Bool
    rootsUnforwarded : M → SA
  asserts
    ∀ m : M, a : A, n : N
    nodeAtAddr(a, m) == m.mem[effectiveAddr(a, m)]
    allNodes(m) == m.uncopied ∪ m.unscanned ∪ m.scanned
    copiedNodes(m) == m.unscanned ∪ m.scanned
    addrFree(a, m) == ¬defined(m.mem, a)
    addrCopied(a, m) == a ∈ copiedNodes(m)
                        ∨ defined(m.forwarded, a)
    addrUncopied(a, m) == a ∈ m.uncopied
    addrUnscanned(a, m) == a ∈ m.unscanned
    addrScanned(a, m) == a ∈ m.scanned
    addrForwarded(a, m) == a ∈ copiedNodes(m)
    addrUnforwarded(a, m) == a ∈ m.uncopied
                             ∨ defined(m.forwarded, a)
    addrRoot(a, m) == a ∈ m.roots
    rootsUnforwarded(m) == addrsUnforwarded(m.roots, m)
  implies
    converts nodeAtAddr, allNodes, copiedNodes, addrFree, addrCopied,
      addrUncopied, addrUnscanned, addrScanned, addrForwarded,
      addrUnforwarded, addrRoot, rootsUnforwarded, addrsUnforwarded
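
rootsUnforwarded is obtained by instantiating the filter operator of ElementTest with addrUnforwarded. The following C fragment is a minimal sketch of that instantiation, assuming a hypothetical 16-address space with bit-mask sets; the names Memory, ElemTest, filter, addr_unforwarded and roots_unforwarded are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NADDR 16            /* hypothetical tiny address space: 0..15 */
#define UNDEF (-1)          /* "a is not in domain(forwarded)" */

typedef uint32_t AddrSet;   /* bit a set  <=>  address a is a member */

typedef struct {
    AddrSet roots, uncopied, unscanned, scanned;
    int forwarded[NADDR];   /* forwarding entry, or UNDEF */
} Memory;

typedef bool (*ElemTest)(int a, const Memory *m);

/* filter(c, t) from ElementTest: keep the members that pass. */
static AddrSet filter(AddrSet c, ElemTest pass, const Memory *m) {
    AddrSet out = 0;
    for (int a = 0; a < NADDR; a++)
        if (((c >> a) & 1u) && pass(a, m)) out |= 1u << a;
    return out;
}

/* addrUnforwarded(a, m) == a in m.uncopied \/ defined(m.forwarded, a) */
static bool addr_unforwarded(int a, const Memory *m) {
    return ((m->uncopied >> a) & 1u) || m->forwarded[a] != UNDEF;
}

/* rootsUnforwarded(m) == addrsUnforwarded(m.roots, m) */
static AddrSet roots_unforwarded(const Memory *m) {
    return filter(m->roots, addr_unforwarded, m);
}
```

A root counts as unforwarded either because its node has not been copied yet, or because the node has been copied but the root still holds the old address.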


FiniteMappingAux(Map, Domain, Range) : trait
  includes FiniteMap(Map, Domain, Range)
  includes Set(Domain, SDomain)
  includes Set(Range, SRange)
  introduces
    __[__] : Map, Domain → Range
    unbind : Map, Domain → Map
    rebind : Map, Domain, Domain → Map
    isOneToOne : Map → Bool
    bound : Map, Range → Bool
    domain : Map → SDomain
    range : Map → SRange
  asserts
    ∀ m : Map, d, d1, d2 : Domain, r, r1, r2 : Range
    m[d] == apply(m, d)
    unbind({}, d) == {}
    unbind(bind(m, d1, r), d2) ==
      if d1 = d2 then m
      else bind(unbind(m, d2), d1, r)
    rebind({}, d1, d2) == {}
    rebind(bind(m, d, r), d1, d2) ==
      if d = d1 then bind(m, d2, r)
      else bind(rebind(m, d1, d2), d, r)
    isOneToOne({})
    isOneToOne(bind(m, d, r)) == ¬bound(m, r) ∧ isOneToOne(m)
    ¬bound({}, r)
    bound(bind(m, d, r1), r2) == r1 = r2 ∨ bound(m, r2)
    domain({}) == {}
    domain(bind(m, d, r)) == insert(d, domain(m))
    range({}) == {}
    range(bind(m, d, r)) == insert(r, range(m))
  implies
    ∀ m : Map, d1 : Domain
    ¬defined(unbind(m, d1), d1)
    converts unbind, rebind, isOneToOne, bound, __[__], domain, range
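
The operators of FiniteMappingAux can be illustrated with an array-backed map. The following C fragment is a minimal sketch under two assumptions that are not part of the trait: domain and range elements are small non-negative integers, and each domain element is bound at most once, so that overwriting an array slot agrees with the equations for unbind and rebind. All names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

#define NDOM 8              /* hypothetical domain: 0..7 */
#define UNBOUND (-1)        /* marks "d is not in the domain" */

typedef struct { int to[NDOM]; } Map;   /* to[d] is the value bound to d */

static Map map_empty(void) {
    Map m;
    for (int d = 0; d < NDOM; d++) m.to[d] = UNBOUND;
    return m;
}

static Map  map_bind(Map m, int d, int r) { m.to[d] = r; return m; }
static Map  map_unbind(Map m, int d)      { m.to[d] = UNBOUND; return m; }
static bool map_defined(Map m, int d)     { return m.to[d] != UNBOUND; }

/* rebind(m, d1, d2): move the binding of d1, if any, to d2. */
static Map map_rebind(Map m, int d1, int d2) {
    if (map_defined(m, d1)) {
        int r = m.to[d1];
        m.to[d1] = UNBOUND;
        m.to[d2] = r;
    }
    return m;
}

/* bound(m, r): some domain element maps to r (r is assumed >= 0). */
static bool map_bound(Map m, int r) {
    for (int d = 0; d < NDOM; d++)
        if (m.to[d] == r) return true;
    return false;
}

/* isOneToOne(m): no two domain elements share a range element. */
static bool map_is_one_to_one(Map m) {
    for (int d = 0; d < NDOM; d++) {
        if (!map_defined(m, d)) continue;
        for (int e = d + 1; e < NDOM; e++)
            if (m.to[e] == m.to[d]) return false;
    }
    return true;
}
```

In the collector specification, rebind is what moves a node from its old address to its new one in m.mem, and isOneToOne is what rules out two addresses sharing a forwarding target.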


PairwiseElementTest2Arg(pass, E1, E2, S1, S2, T1, T2) : trait
  assumes Set(E1, S1)
  assumes Set(E2, S2)
  introduces
    pass : E1, E2, T1, T2 → Bool
    allPass : S1, S2, T1, T2 → Bool
    onePasses : E1, S2, T1, T2 → Bool
    removePassing : E1, S2, T1, T2 → S2
  asserts
    ∀ s1 : S1, s2 : S2, e1 : E1, e2 : E2, t1 : T1, t2 : T2
    allPass({}, {}, t1, t2)
    allPass(insert(e1, s1), s2, t1, t2) ==
      onePasses(e1, s2, t1, t2)
      ∧ allPass(s1, removePassing(e1, s2, t1, t2), t1, t2)
    ¬onePasses(e1, {}, t1, t2)
    onePasses(e1, insert(e2, s2), t1, t2) ==
      pass(e1, e2, t1, t2)
      ∨ onePasses(e1, s2, t1, t2)
    removePassing(e1, {}, t1, t2) == {}
    removePassing(e1, insert(e2, s2), t1, t2) ==
      if pass(e1, e2, t1, t2)
      then s2
      else insert(e2, removePassing(e1, s2, t1, t2))
  implies
    converts allPass, onePasses, removePassing


TestSet1Arg(elemOp, setOp, E, SE, A) : trait
  assumes Set(E, SE)
  introduces
    setOp : SE, A → Bool
    elemOp : E, A → Bool
  asserts ∀ e : E, se : SE, a : A
    setOp({}, a)
    setOp(insert(e, se), a) == setOp(se, a) ∧ elemOp(e, a)
  implies converts setOp


These are traits from the LSL handbook¹.

Set(E, C) : trait
  includes
    SetBasics,
    Natural(N),
    DerivedOrders(C, ⊆ for ≤, ⊇ for ≥, ⊂ for <, ⊃ for >)
  introduces
    delete : E, C → C
    {__} : E → C
    __∪__, __∩__, __-__ : C, C → C
    size : C → N
  asserts
    ∀ e, e1, e2 : E, s, s1, s2 : C
    {e} == insert(e, {})
    e1 ∈ delete(e2, s) == e1 ≠ e2 ∧ e1 ∈ s
    e ∈ (s1 ∪ s2) == e ∈ s1 ∨ e ∈ s2
    e ∈ (s1 ∩ s2) == e ∈ s1 ∧ e ∈ s2
    e ∈ (s1 - s2) == e ∈ s1 ∧ e ∉ s2
    size({}) == 0
    size(insert(e, s)) == if e ∉ s then size(s) + 1 else size(s)
    s1 ⊆ s2 == s1 - s2 = {}
  implies
    AbelianMonoid(∪ for ∘, {} for unit, C for T),
    AC(∩, C),
    JoinOp(∪),
    MemberOp,
    PartialOrder(C, ⊆ for ≤, ⊇ for ≥, ⊂ for <, ⊃ for >),
    UnorderedContainer
    C generated by {}, {__}, ∪
    ∀ e : E, s, s1, s2 : C
    insert(e, s) ≠ {}
    insert(e, insert(e, s)) == insert(e, s)
    s1 ⊆ s2 == s1 - s2 = {}
    converts ∈, {__}, delete, size, ∪, ∩, -, ⊆, ⊇, ⊂, ⊃


SetBasics(E, C) : trait
  introduces
    {} : → C
    insert : E, C → C
    __∈__, __∉__ : E, C → Bool
  asserts
    C generated by {}, insert
    C partitioned by ∈
    ∀ s : C, e, e1, e2 : E
    e ∉ s == ¬(e ∈ s)
    e ∉ {}
    e1 ∈ insert(e2, s) == e1 = e2 ∨ e1 ∈ s
  implies
    UnorderedContainer,
    MemberOp
    ∀ e, e1, e2 : E, s : C
    insert(e, s) ≠ {}
    insert(e, insert(e, s)) == insert(e, s)
    converts ∈, ∉

¹ Copyright © 1991 J.V. Guttag and Digital Equipment Corporation.


SetOps : trait
  assumes
    Countable,
    SetBasics
  includes CollectionOps(false for dups)
  introduces
    delete : E, C → C
    __∪__, __∩__ : C, C → C
  asserts
    ∀ e, e1, e2 : E, s, s1, s2 : C
    e1 ∈ delete(e2, s) == e1 ≠ e2 ∧ e1 ∈ s
    e ∈ (s1 ∪ s2) == e ∈ s1 ∨ e ∈ s2
    e ∈ (s1 ∩ s2) == e ∈ s1 ∧ e ∈ s2
    e ∈ (s1 - s2) == e ∈ s1 ∧ e ∉ s2
  implies
    AbelianMonoid(∪ for ∘, {} for unit, C for T),
    AC(∩, C),
    JoinOp(∪),
    PartialOrder(C, ⊆ for ≤, ⊇ for ≥, ⊂ for <, ⊃ for >)
    C generated by {}, {__}, ∪
    ∀ e : E, s, s1, s2 : C
    size(insert(e, s)) == if e ∈ s then size(s) else succ(size(s))
    s1 ⊆ s2 == s1 - s2 = {}
    converts ∈, {__}, delete, size, ∪, ∩, -, ⊆, ⊇, ⊂, ⊃


ElementTest(pass, E, C, T) : trait
  assumes Container
  introduces
    pass : E, T → Bool
    somePass : C, T → Bool
    allPass : C, T → Bool
    filter : C, T → C
  asserts ∀ c : C, e : E, t : T
    ¬somePass({}, t)
    somePass(insert(e, c), t) == pass(e, t) ∨ somePass(c, t)
    allPass({}, t)
    allPass(insert(e, c), t) == pass(e, t) ∧ allPass(c, t)
    filter({}, t) == {}
    filter(insert(e, c), t) ==
      if pass(e, t) then insert(e, filter(c, t)) else filter(c, t)
  implies converts somePass, allPass, filter


FiniteMap(M, D, R) : trait
  introduces
    {} : → M
    bind : M, D, R → M
    apply : M, D → R
    defined : M, D → Bool
  asserts
    M generated by {}, bind
    M partitioned by apply, defined
    ∀ m : M, d, d1, d2 : D, r : R
    apply(bind(m, d2, r), d1) == if d1 = d2 then r else apply(m, d1)
    ¬defined({}, d)
    defined(bind(m, d2, r), d1) == (d1 = d2) ∨ defined(m, d1)
  implies
    converts apply, defined
      exempting ∀ d : D apply({}, d)


7.2.  Interfaces 
imports base;
uses GC(memory for N, addr for A);

void gc(void) memory mem; {
  requires isInitialMemory(mem^);
  modifies mem;
  ensures
    isFullGC(mem^, mem')
    /\ isFinalGCMemory(mem');
}
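
To make the gc interface concrete, the following C fragment is a minimal sketch of a Cheney-style copying collector. It is not the implementation proved correct in this paper. It assumes a hypothetical heap of eight fixed-size nodes per semispace, each node with two reference slots, and integer indices in place of machine addresses; the names Heap, Node, forward_addr, HEAP and NIL are invented for the example. The scan and free indices split to-space into the scanned and unscanned sets of the specification.

```c
#include <assert.h>

#define HEAP 8              /* nodes per semispace (hypothetical size) */
#define NIL  (-1)           /* null reference */

typedef struct {
    int val;                /* non-pointer payload */
    int ptr[2];             /* referenced nodes (indices), or NIL */
    int fwd;                /* from-space only: index of the copy, or NIL */
} Node;

typedef struct {
    Node from[HEAP], to[HEAP];
    int scan, free;         /* to-space: [0,scan) scanned, [scan,free) unscanned */
} Heap;

/* Redirect *a to the to-space copy, copying the node on first visit. */
static void forward_addr(Heap *h, int *a) {
    if (*a == NIL) return;
    Node *n = &h->from[*a];
    if (n->fwd == NIL) {            /* uncopied: one copy step */
        h->to[h->free] = *n;        /* the copy starts with fwd == NIL */
        n->fwd = h->free++;
    }
    *a = n->fwd;
}

/* Collect; returns the number of surviving nodes. */
static int gc(Heap *h, int roots[], int nroots) {
    h->scan = h->free = 0;
    for (int i = 0; i < nroots; i++)    /* forward the roots */
        forward_addr(h, &roots[i]);
    while (h->scan < h->free) {         /* scan until no unscanned node is left */
        Node *n = &h->to[h->scan++];
        forward_addr(h, &n->ptr[0]);
        forward_addr(h, &n->ptr[1]);
    }
    return h->free;
}
```

When the loop exits, scan equals free, which corresponds to the unscanned set being empty in isFinalGCMemory, and every surviving node was reached from a root, which corresponds to the reachable conjunct of isFullGC.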


imports base;
uses GC(memory for N, addr for A);

void finalizeGC(void) memory mem; {
  requires isFinalGCMemory(mem^);
  modifies mem;
  ensures
    isInitialMemory(mem')
    /\ mem^.scanned = mem'.uncopied
    /\ mem^.roots = mem'.roots
    /\ mem^.mem = mem'.mem;
}


imports base;
uses GC(memory for N, addr for A);

void forwardRoots(void) memory mem; {
  requires isInitialMemory(mem^);
  modifies mem;
  ensures
    {} = rootsUnforwarded(mem')
    /\ mem'.roots = mem'.unscanned
    /\ {} = mem'.scanned
    /\ memEquiv(mem^, mem');
}


imports base;

addr nextUnforwardedRoot(void) memory mem; {
  requires isValidMemory(mem^);
  ensures if {} = rootsUnforwarded(mem^)
          then result = nilL
          else result \in rootsUnforwarded(mem^);
}


uses Forward(memory for N, addr for A);

void forwardRootAddr(addr *a) memory mem; {
  requires
    isValidMemory(mem^) /\ (*a)^ \in rootsUnforwarded(mem^);
  modifies mem, *a;
  ensures
    mem'.roots = (mem^.roots - {(*a)^}) \U {(*a)'}
    /\ isForwardStep((*a)^, (*a)', mem^, mem');
}


imports base;

void scanUnscanned(void) memory mem; {
  requires
    {} = rootsUnforwarded(mem^)
    /\ mem^.roots = mem^.unscanned
    /\ {} = mem^.scanned
    /\ isValidMemory(mem^);
  modifies mem;
  ensures
    mem^.roots = mem'.roots
    /\ mem'.unscanned = {}
    /\ memEquiv(mem^, mem')
    /\ addrsEquiv(reachable(mem^.roots, mem^),
                  mem'.scanned, mem^, mem');
}


imports base;

addr nextUnscannedNode(void) memory mem; {
  requires isValidMemory(mem^);
  ensures if {} = mem^.unscanned
          then result = nilL
          else result \in mem^.unscanned;
}




imports base;
uses Scan(memory for N, addr for A);

void scanAddr(addr a) memory mem; {
  requires
    isValidMemory(mem^)
    /\ addrUnscanned(a, mem^)
    /\ addrsEquiv(reachable(mem^.roots, mem^),
                  reachable(copiedNodes(mem^), mem^),
                  mem^, mem^);
  modifies mem;
  ensures
    isScanStep(a, mem^, mem')
    /\ addrsEquiv(reachable(mem^.roots, mem^),
                  reachable(copiedNodes(mem'), mem'),
                  mem^, mem');
}


imports base;
uses Forward(memory for M, addr for A);

void forwardAddr(addr *a) memory mem; {
  requires isValidMemory(mem^);
  modifies mem, *a;
  ensures
    if isForwardedAddr((*a)^, (*a)', mem^, mem')
    then (*a)^ = (*a)' /\ mem^ = mem'
    else isForwardStep((*a)^, (*a)', mem^, mem');
}


imports base;
uses Copy(memory for N, addr for A);

void copyAddr(addr a) memory mem; {
  requires
    isValidMemory(mem^)
    /\ addrUncopied(a, mem^);
  modifies mem;
  ensures isCopyStep(a, mem^, mem');
}


This is base.lcl.


abstract type addr;
constant addr nilL;
abstract type memory;

memory mem;

uses Memory(memory for N, addr for A);
uses Reachable(memory for N, addr for A);
uses Equiv(memory for M, addr for A);